Overview

Dataset statistics

Number of variables: 49
Number of observations: 289628
Missing cells: 6084825
Missing cells (%): 42.9%
Total size in memory: 108.3 MiB
Average record size in memory: 392.0 B
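
As a quick consistency check, the headline figures above can be recomputed from one another. The sketch below assumes that the missing-cell percentage is missing cells divided by (variables × observations) and that the average record size is total memory divided by observations; the variable names are illustrative and not taken from the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 49
n_observations = 289_628
missing_cells = 6_084_825
total_memory_mib = 108.3  # already rounded in the report

# Assumed definition: share of all cells (rows x columns) that are empty.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes. The result is approximate because the
# memory total is itself rounded to one decimal place.
avg_record_bytes = total_memory_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

Both values agree with the reported 42.9% and roughly 392 B per record.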

Variable types

Text: 49

Dataset

Description: Naturalis Biodiversity Center (NL) - Aves 0061686-241126133413365
URL: https://doi.org/10.15468/dl.u5tv27

Alerts

license has constant value "CC0 1.0" Constant
rightsHolder has constant value "Naturalis Biodiversity Center" Constant
institutionID has constant value "https://ror.org/0566bfb96" Constant
collectionCode has constant value "Aves" Constant
associatedTaxa has constant value "has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp." Constant
locationAccordingTo has constant value "45.0083" Constant
locationRemarks has constant value "128.0083" Constant
geodeticDatum has constant value "WGS84" Constant
namePublishedInID has constant value "Crossoptilon mantchuricum Swinhoe" Constant
namePublishedIn has constant value "Animalia" Constant
namePublishedInYear has constant value "Animalia" Constant
kingdom has constant value "Animalia" Constant
tribe has constant value "Crossoptilon" Constant
subgenus has constant value "mantchuricum" Constant
nomenclaturalCode has constant value "ICZN" Constant
recordNumber has 276338 (95.4%) missing values Missing
recordedBy has 92827 (32.1%) missing values Missing
individualCount has 30538 (10.5%) missing values Missing
sex has 98166 (33.9%) missing values Missing
lifeStage has 206842 (71.4%) missing values Missing
associatedTaxa has 289625 (> 99.9%) missing values Missing
eventDate has 74040 (25.6%) missing values Missing
verbatimEventDate has 59530 (20.6%) missing values Missing
island has 200031 (69.1%) missing values Missing
country has 45132 (15.6%) missing values Missing
stateProvince has 136488 (47.1%) missing values Missing
locality has 78963 (27.3%) missing values Missing
verbatimElevation has 287041 (99.1%) missing values Missing
locationAccordingTo has 289627 (> 99.9%) missing values Missing
locationRemarks has 289627 (> 99.9%) missing values Missing
decimalLatitude has 136554 (47.1%) missing values Missing
decimalLongitude has 135979 (46.9%) missing values Missing
coordinateUncertaintyInMeters has 287974 (99.4%) missing values Missing
typeStatus has 286162 (98.8%) missing values Missing
identifiedBy has 289216 (99.9%) missing values Missing
dateIdentified has 289371 (99.9%) missing values Missing
namePublishedInID has 289627 (> 99.9%) missing values Missing
namePublishedIn has 289627 (> 99.9%) missing values Missing
namePublishedInYear has 289627 (> 99.9%) missing values Missing
class has 286898 (99.1%) missing values Missing
order has 287366 (99.2%) missing values Missing
family has 74054 (25.6%) missing values Missing
tribe has 289627 (> 99.9%) missing values Missing
subgenus has 289627 (> 99.9%) missing values Missing
infraspecificEpithet has 89169 (30.8%) missing values Missing
scientificNameAuthorship has 17143 (5.9%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
catalogNumber has unique values Unique
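
The alert list above mixes three kinds of flags. A minimal sketch of how such flags could be derived from per-column counts is shown below. The rules and the `missing_threshold` value are assumptions made for illustration (the lowest missing share flagged above is 5.9%), and `classify` is a hypothetical helper, not part of the profiling library. The sketch treats "Constant" as one distinct value among the non-missing cells, which would explain why associatedTaxa is flagged as both constant and more than 99.9% missing.

```python
def classify(n_rows, n_missing, n_distinct, missing_threshold=0.05):
    """Return alert labels under assumed rules (hypothetical helper)."""
    alerts = []
    n_present = n_rows - n_missing
    # Constant: exactly one distinct value among the non-missing cells.
    if n_present > 0 and n_distinct == 1:
        alerts.append("Constant")
    # Unique: no missing cells and every value differs.
    if n_missing == 0 and n_distinct == n_rows:
        alerts.append("Unique")
    # Missing: share of empty cells above the assumed threshold.
    if n_missing / n_rows > missing_threshold:
        alerts.append("Missing")
    return alerts


N = 289_628  # number of observations in the report

print("license:", classify(N, 0, 1))
print("associatedTaxa:", classify(N, 289_625, 1))
print("gbifID:", classify(N, 0, N))
print("recordNumber:", classify(N, 276_338, 5_837))
```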

Reproduction

Analysis started: 2025-01-14 15:45:13.138864
Analysis finished: 2025-01-14 15:45:20.961624
Duration: 7.82 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
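
The reported duration follows directly from the two timestamps. A small sketch using only the standard library, with the timestamp strings copied from the lines above:

```python
from datetime import datetime

# Timestamp layout as printed in the report.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-01-14 15:45:13.138864", fmt)
finished = datetime.strptime("2025-01-14 15:45:20.961624", fmt)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```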

Variables

gbifID
Text

Unique 

Distinct: 289628
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2896280
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 289628
Unique (%): 100.0%

Sample

1st row: 2434047501
2nd row: 2434047502
3rd row: 2434047503
4th row: 2434047504
5th row: 2434047505
Value Count Frequency (%)
2434047501 1
 
< 0.1%
2433858683 1
 
< 0.1%
2434047506 1
 
< 0.1%
2434047507 1
 
< 0.1%
2434047508 1
 
< 0.1%
2434047523 1
 
< 0.1%
2434047509 1
 
< 0.1%
2433858690 1
 
< 0.1%
2433858838 1
 
< 0.1%
2434047504 1
 
< 0.1%
Other values (289618) 289618
> 99.9%

Most occurring characters

ValueCountFrequency (%)
4 645268
22.3%
3 506883
17.5%
2 475626
16.4%
1 243866
 
8.4%
0 212854
 
7.3%
9 194666
 
6.7%
8 173529
 
6.0%
7 150795
 
5.2%
5 148418
 
5.1%
6 144375
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2896280
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 645268
22.3%
3 506883
17.5%
2 475626
16.4%
1 243866
 
8.4%
0 212854
 
7.3%
9 194666
 
6.7%
8 173529
 
6.0%
7 150795
 
5.2%
5 148418
 
5.1%
6 144375
 
5.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2896280
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 645268
22.3%
3 506883
17.5%
2 475626
16.4%
1 243866
 
8.4%
0 212854
 
7.3%
9 194666
 
6.7%
8 173529
 
6.0%
7 150795
 
5.2%
5 148418
 
5.1%
6 144375
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2896280
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 645268
22.3%
3 506883
17.5%
2 475626
16.4%
1 243866
 
8.4%
0 212854
 
7.3%
9 194666
 
6.7%
8 173529
 
6.0%
7 150795
 
5.2%
5 148418
 
5.1%
6 144375
 
5.0%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 2027396
Distinct characters: 5
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
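
To illustrate the character properties mentioned here, the sketch below looks up the general category of each character in the constant license value "CC0 1.0" with Python's standard `unicodedata` module. The mapping from two-letter category codes to the names used in this report is written out by hand for the four categories involved. Script and block properties are not exposed by `unicodedata`, so they are left out.

```python
import unicodedata
from collections import Counter

value = "CC0 1.0"  # the constant license value

# General category code (e.g. "Lu", "Nd") for every character.
categories = Counter(unicodedata.category(ch) for ch in value)

# Hand-written names for the codes that occur in this value.
names = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

for code, count in categories.most_common():
    share = 100 * count / len(value)
    print(f"{names[code]}: {count} of {len(value)} ({share:.1f}%)")
```

Three of the seven characters are decimal digits, which matches the 42.9% share listed for Decimal Number in this section.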

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0 1.0
2nd row: CC0 1.0
3rd row: CC0 1.0
4th row: CC0 1.0
5th row: CC0 1.0
Value Count Frequency (%)
cc0 289628
50.0%
1.0 289628
50.0%
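
The two rows above suggest that this table counts lowercased, whitespace-separated tokens rather than whole cell values. The sketch below reproduces that behaviour on a five-row sample. The tokenisation rule is an assumption inferred from the rows shown, not the library's documented behaviour, and `word_counts` is a hypothetical helper.

```python
from collections import Counter


def word_counts(values):
    """Count lowercased whitespace-separated tokens (assumed rule)."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


sample = ["CC0 1.0"] * 5  # the five sample rows of this column
counts = word_counts(sample)
total = sum(counts.values())

for word, n in counts.items():
    print(word, n, f"{100 * n / total:.1f}%")
```

Each value contributes two tokens, so each token accounts for half of the total, as in the 50.0% rows above.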

Most occurring characters

ValueCountFrequency (%)
C 579256
28.6%
0 579256
28.6%
289628
14.3%
1 289628
14.3%
. 289628
14.3%
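
Because the column holds a single repeated value, the character counts above can be reproduced by counting the characters of "CC0 1.0" once and multiplying by the number of rows. A short sketch with illustrative variable names:

```python
from collections import Counter

value = "CC0 1.0"
n_rows = 289_628  # observations, none missing for this column

# Per-character counts for one value, scaled to the whole column.
per_value = Counter(value)
totals = {ch: count * n_rows for ch, count in per_value.items()}
total_chars = len(value) * n_rows

for ch, count in totals.items():
    print(repr(ch), count, f"{100 * count / total_chars:.1f}%")
```

The unlabelled row in the table corresponds to the space character.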

Most occurring categories

ValueCountFrequency (%)
Decimal Number 868884
42.9%
Uppercase Letter 579256
28.6%
Space Separator 289628
 
14.3%
Other Punctuation 289628
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 579256
66.7%
1 289628
33.3%
Uppercase Letter
ValueCountFrequency (%)
C 579256
100.0%
Space Separator
ValueCountFrequency (%)
289628
100.0%
Other Punctuation
ValueCountFrequency (%)
. 289628
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1448140
71.4%
Latin 579256
 
28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 579256
40.0%
289628
20.0%
1 289628
20.0%
. 289628
20.0%
Latin
ValueCountFrequency (%)
C 579256
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2027396
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 579256
28.6%
0 579256
28.6%
289628
14.3%
1 289628
14.3%
. 289628
14.3%
Distinct: 1169
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2896280
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 229
Unique (%): 0.1%

Sample

1st row: 2015/06/05
2nd row: 2023/05/16
3rd row: 2015/09/02
4th row: 2017/07/01
5th row: 2015/05/23
Value Count Frequency (%)
2017/06/30 47834
16.5%
2023/05/16 41000
14.2%
2017/07/01 26280
 
9.1%
2015/05/23 17611
 
6.1%
2015/07/03 13223
 
4.6%
2015/05/18 11421
 
3.9%
2015/07/01 10549
 
3.6%
2015/06/24 9657
 
3.3%
2015/07/02 9646
 
3.3%
2015/06/23 9602
 
3.3%
Other values (1159) 92805
32.0%

Most occurring characters

ValueCountFrequency (%)
0 730395
25.2%
/ 579256
20.0%
2 487219
16.8%
1 369028
12.7%
5 235720
 
8.1%
3 146696
 
5.1%
6 141889
 
4.9%
7 139337
 
4.8%
8 26564
 
0.9%
9 21312
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2317024
80.0%
Other Punctuation 579256
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 730395
31.5%
2 487219
21.0%
1 369028
15.9%
5 235720
 
10.2%
3 146696
 
6.3%
6 141889
 
6.1%
7 139337
 
6.0%
8 26564
 
1.1%
9 21312
 
0.9%
4 18864
 
0.8%
Other Punctuation
ValueCountFrequency (%)
/ 579256
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2896280
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 730395
25.2%
/ 579256
20.0%
2 487219
16.8%
1 369028
12.7%
5 235720
 
8.1%
3 146696
 
5.1%
6 141889
 
4.9%
7 139337
 
4.8%
8 26564
 
0.9%
9 21312
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2896280
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 730395
25.2%
/ 579256
20.0%
2 487219
16.8%
1 369028
12.7%
5 235720
 
8.1%
3 146696
 
5.1%
6 141889
 
4.9%
7 139337
 
4.8%
8 26564
 
0.9%
9 21312
 
0.7%

rightsHolder
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 8399212
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Naturalis Biodiversity Center
2nd row: Naturalis Biodiversity Center
3rd row: Naturalis Biodiversity Center
4th row: Naturalis Biodiversity Center
5th row: Naturalis Biodiversity Center
Value Count Frequency (%)
naturalis 289628
33.3%
biodiversity 289628
33.3%
center 289628
33.3%

Most occurring characters

ValueCountFrequency (%)
i 1158512
13.8%
t 868884
10.3%
r 868884
10.3%
e 868884
10.3%
579256
 
6.9%
s 579256
 
6.9%
a 579256
 
6.9%
d 289628
 
3.4%
C 289628
 
3.4%
y 289628
 
3.4%
Other values (7) 2027396
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6951072
82.8%
Uppercase Letter 868884
 
10.3%
Space Separator 579256
 
6.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1158512
16.7%
t 868884
12.5%
r 868884
12.5%
e 868884
12.5%
s 579256
8.3%
a 579256
8.3%
d 289628
 
4.2%
y 289628
 
4.2%
v 289628
 
4.2%
o 289628
 
4.2%
Other values (3) 868884
12.5%
Uppercase Letter
ValueCountFrequency (%)
C 289628
33.3%
N 289628
33.3%
B 289628
33.3%
Space Separator
ValueCountFrequency (%)
579256
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7819956
93.1%
Common 579256
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1158512
14.8%
t 868884
11.1%
r 868884
11.1%
e 868884
11.1%
s 579256
 
7.4%
a 579256
 
7.4%
d 289628
 
3.7%
C 289628
 
3.7%
y 289628
 
3.7%
v 289628
 
3.7%
Other values (6) 1737768
22.2%
Common
ValueCountFrequency (%)
579256
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8399212
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1158512
13.8%
t 868884
10.3%
r 868884
10.3%
e 868884
10.3%
579256
 
6.9%
s 579256
 
6.9%
a 579256
 
6.9%
d 289628
 
3.4%
C 289628
 
3.4%
y 289628
 
3.4%
Other values (7) 2027396
24.1%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 25
Median length: 25
Mean length: 25
Min length: 25

Characters and Unicode

Total characters: 7240700
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: https://ror.org/0566bfb96
2nd row: https://ror.org/0566bfb96
3rd row: https://ror.org/0566bfb96
4th row: https://ror.org/0566bfb96
5th row: https://ror.org/0566bfb96
Value Count Frequency (%)
https://ror.org/0566bfb96 289628
100.0%

Most occurring characters

ValueCountFrequency (%)
/ 868884
12.0%
r 868884
12.0%
6 868884
12.0%
t 579256
 
8.0%
o 579256
 
8.0%
b 579256
 
8.0%
h 289628
 
4.0%
p 289628
 
4.0%
s 289628
 
4.0%
: 289628
 
4.0%
Other values (6) 1737768
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4054792
56.0%
Decimal Number 1737768
24.0%
Other Punctuation 1448140
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 868884
21.4%
t 579256
14.3%
o 579256
14.3%
b 579256
14.3%
h 289628
 
7.1%
p 289628
 
7.1%
s 289628
 
7.1%
g 289628
 
7.1%
f 289628
 
7.1%
Decimal Number
ValueCountFrequency (%)
6 868884
50.0%
0 289628
 
16.7%
5 289628
 
16.7%
9 289628
 
16.7%
Other Punctuation
ValueCountFrequency (%)
/ 868884
60.0%
: 289628
 
20.0%
. 289628
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4054792
56.0%
Common 3185908
44.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 868884
21.4%
t 579256
14.3%
o 579256
14.3%
b 579256
14.3%
h 289628
 
7.1%
p 289628
 
7.1%
s 289628
 
7.1%
g 289628
 
7.1%
f 289628
 
7.1%
Common
ValueCountFrequency (%)
/ 868884
27.3%
6 868884
27.3%
: 289628
 
9.1%
. 289628
 
9.1%
0 289628
 
9.1%
5 289628
 
9.1%
9 289628
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7240700
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 868884
12.0%
r 868884
12.0%
6 868884
12.0%
t 579256
 
8.0%
o 579256
 
8.0%
b 579256
 
8.0%
h 289628
 
4.0%
p 289628
 
4.0%
s 289628
 
4.0%
: 289628
 
4.0%
Other values (6) 1737768
24.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1158512
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Aves
2nd row: Aves
3rd row: Aves
4th row: Aves
5th row: Aves
Value Count Frequency (%)
aves 289628
100.0%

Most occurring characters

ValueCountFrequency (%)
A 289628
25.0%
v 289628
25.0%
e 289628
25.0%
s 289628
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 868884
75.0%
Uppercase Letter 289628
 
25.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
v 289628
33.3%
e 289628
33.3%
s 289628
33.3%
Uppercase Letter
ValueCountFrequency (%)
A 289628
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1158512
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 289628
25.0%
v 289628
25.0%
e 289628
25.0%
s 289628
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1158512
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 289628
25.0%
v 289628
25.0%
e 289628
25.0%
s 289628
25.0%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 17
Median length: 17
Mean length: 16.99979284
Min length: 13
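
The mean length just below 17 can be reproduced from the two values in this column and their counts, which are listed in the value table further down in this section (289613 and 15 rows). A worked sketch follows; the capitalisation of "OtherSpecimen" is inferred from the uppercase-letter counts and does not affect the lengths.

```python
# Value -> number of rows, taken from the value table of this column.
value_counts = {"PreservedSpecimen": 289_613, "OtherSpecimen": 15}

# Total characters is the sum of (length x rows) over both values.
total_chars = sum(len(v) * n for v, n in value_counts.items())
n_values = sum(value_counts.values())
mean_length = total_chars / n_values

print("Total characters:", total_chars)
print("Mean length:", round(mean_length, 8))
```

The 15 shorter values (13 characters instead of 17) pull the mean down by 60 / 289628, or about 0.0002.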

Characters and Unicode

Total characters: 4923616
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen
Value Count Frequency (%)
preservedspecimen 289613
> 99.9%
otherspecimen 15
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 1448110
29.4%
r 579241
 
11.8%
S 289628
 
5.9%
p 289628
 
5.9%
c 289628
 
5.9%
i 289628
 
5.9%
m 289628
 
5.9%
n 289628
 
5.9%
P 289613
 
5.9%
s 289613
 
5.9%
Other values (5) 579271
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4344360
88.2%
Uppercase Letter 579256
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1448110
33.3%
r 579241
 
13.3%
p 289628
 
6.7%
c 289628
 
6.7%
i 289628
 
6.7%
m 289628
 
6.7%
n 289628
 
6.7%
s 289613
 
6.7%
v 289613
 
6.7%
d 289613
 
6.7%
Other values (2) 30
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
S 289628
50.0%
P 289613
50.0%
O 15
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4923616
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1448110
29.4%
r 579241
 
11.8%
S 289628
 
5.9%
p 289628
 
5.9%
c 289628
 
5.9%
i 289628
 
5.9%
m 289628
 
5.9%
n 289628
 
5.9%
P 289613
 
5.9%
s 289613
 
5.9%
Other values (5) 579271
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4923616
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1448110
29.4%
r 579241
 
11.8%
S 289628
 
5.9%
p 289628
 
5.9%
c 289628
 
5.9%
i 289628
 
5.9%
m 289628
 
5.9%
n 289628
 
5.9%
P 289613
 
5.9%
s 289613
 
5.9%
Other values (5) 579271
11.8%

occurrenceID
Text

Unique 

Distinct: 289628
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 77
Median length: 71
Mean length: 67.19895521
Min length: 62

Characters and Unicode

Total characters: 19462699
Distinct characters: 44
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 289628
Unique (%): 100.0%

Sample

1st row: https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.2
2nd row: https://data.biodiversitydata.nl/naturalis/specimen/RMNH.AVES.4
3rd row: https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.18
4th row: https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.27
5th row: https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.36
Value Count Frequency (%)
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.2 1
 
< 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/rmnh.5069738 1
 
< 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.45 1
 
< 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.54 1
 
< 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.72 1
 
< 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.222 1
 
< 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.81 1
 
< 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/rmnh.5069558 1
 
< 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/rmnh.5069792 1
 
< 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.27 1
 
< 0.1%
Other values (289618) 289618
> 99.9%

Most occurring characters

ValueCountFrequency (%)
a 1740739
 
8.9%
t 1737768
 
8.9%
/ 1448140
 
7.4%
i 1448140
 
7.4%
. 1166174
 
6.0%
s 1158512
 
6.0%
d 868963
 
4.5%
e 868894
 
4.5%
n 868884
 
4.5%
l 579256
 
3.0%
Other values (34) 7577229
38.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12750862
65.5%
Other Punctuation 2903942
 
14.9%
Uppercase Letter 2247345
 
11.5%
Decimal Number 1560550
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1740739
13.7%
t 1737768
13.6%
i 1448140
11.4%
s 1158512
9.1%
d 868963
 
6.8%
e 868894
 
6.8%
n 868884
 
6.8%
l 579256
 
4.5%
p 579256
 
4.5%
r 579256
 
4.5%
Other values (9) 2321194
18.2%
Uppercase Letter
ValueCountFrequency (%)
A 352650
15.7%
M 289627
12.9%
E 287857
12.8%
S 287856
12.8%
V 287856
12.8%
R 224833
10.0%
N 224833
10.0%
H 224833
10.0%
Z 64794
 
2.9%
P 2204
 
0.1%
Other values (2) 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 228550
14.6%
2 202110
13.0%
5 154374
9.9%
3 152317
9.8%
4 146302
9.4%
6 139539
8.9%
0 137047
8.8%
7 135271
8.7%
8 133717
8.6%
9 131323
8.4%
Other Punctuation
ValueCountFrequency (%)
/ 1448140
49.9%
. 1166174
40.2%
: 289628
 
10.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14998207
77.1%
Common 4464492
 
22.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1740739
 
11.6%
t 1737768
 
11.6%
i 1448140
 
9.7%
s 1158512
 
7.7%
d 868963
 
5.8%
e 868894
 
5.8%
n 868884
 
5.8%
l 579256
 
3.9%
p 579256
 
3.9%
r 579256
 
3.9%
Other values (21) 4568539
30.5%
Common
ValueCountFrequency (%)
/ 1448140
32.4%
. 1166174
26.1%
: 289628
 
6.5%
1 228550
 
5.1%
2 202110
 
4.5%
5 154374
 
3.5%
3 152317
 
3.4%
4 146302
 
3.3%
6 139539
 
3.1%
0 137047
 
3.1%
Other values (3) 400311
 
9.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19462699
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1740739
 
8.9%
t 1737768
 
8.9%
/ 1448140
 
7.4%
i 1448140
 
7.4%
. 1166174
 
6.0%
s 1158512
 
6.0%
d 868963
 
4.5%
e 868894
 
4.5%
n 868884
 
4.5%
l 579256
 
3.0%
Other values (34) 7577229
38.9%

catalogNumber
Text

Unique 

Distinct: 289628
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 25
Median length: 19
Mean length: 15.19895521
Min length: 10

Characters and Unicode

Total characters: 4402043
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 289628
Unique (%): 100.0%

Sample

1st row: ZMA.AVES.2
2nd row: RMNH.AVES.4
3rd row: ZMA.AVES.18
4th row: ZMA.AVES.27
5th row: ZMA.AVES.36
Value Count Frequency (%)
zma.aves.2 1
 
< 0.1%
rmnh.5069738 1
 
< 0.1%
zma.aves.45 1
 
< 0.1%
zma.aves.54 1
 
< 0.1%
zma.aves.72 1
 
< 0.1%
zma.aves.222 1
 
< 0.1%
zma.aves.81 1
 
< 0.1%
rmnh.5069558 1
 
< 0.1%
rmnh.5069792 1
 
< 0.1%
zma.aves.27 1
 
< 0.1%
Other values (289618) 289618
> 99.9%

Most occurring characters

ValueCountFrequency (%)
. 586918
13.3%
A 352650
 
8.0%
M 289627
 
6.6%
E 287857
 
6.5%
V 287856
 
6.5%
S 287856
 
6.5%
1 228550
 
5.2%
N 224833
 
5.1%
R 224833
 
5.1%
H 224833
 
5.1%
Other values (21) 1406230
31.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2247345
51.1%
Decimal Number 1560550
35.5%
Other Punctuation 586918
 
13.3%
Lowercase Letter 7230
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 352650
15.7%
M 289627
12.9%
E 287857
12.8%
V 287856
12.8%
S 287856
12.8%
N 224833
10.0%
R 224833
10.0%
H 224833
10.0%
Z 64794
 
2.9%
P 2204
 
0.1%
Other values (2) 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 228550
14.6%
2 202110
13.0%
5 154374
9.9%
3 152317
9.8%
4 146302
9.4%
6 139539
8.9%
0 137047
8.8%
7 135271
8.7%
8 133717
8.6%
9 131323
8.4%
Lowercase Letter
ValueCountFrequency (%)
b 2993
41.4%
a 2971
41.1%
c 1060
 
14.7%
x 106
 
1.5%
d 79
 
1.1%
y 10
 
0.1%
e 10
 
0.1%
v 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 586918
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2254575
51.2%
Common 2147468
48.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 352650
15.6%
M 289627
12.8%
E 287857
12.8%
V 287856
12.8%
S 287856
12.8%
N 224833
10.0%
R 224833
10.0%
H 224833
10.0%
Z 64794
 
2.9%
b 2993
 
0.1%
Other values (10) 6443
 
0.3%
Common
ValueCountFrequency (%)
. 586918
27.3%
1 228550
 
10.6%
2 202110
 
9.4%
5 154374
 
7.2%
3 152317
 
7.1%
4 146302
 
6.8%
6 139539
 
6.5%
0 137047
 
6.4%
7 135271
 
6.3%
8 133717
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4402043
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 586918
13.3%
A 352650
 
8.0%
M 289627
 
6.6%
E 287857
 
6.5%
V 287856
 
6.5%
S 287856
 
6.5%
1 228550
 
5.2%
N 224833
 
5.1%
R 224833
 
5.1%
H 224833
 
5.1%
Other values (21) 1406230
31.9%

recordNumber
Text

Missing 

Distinct: 5837
Distinct (%): 43.9%
Missing: 276338
Missing (%): 95.4%
Memory size: 2.2 MiB

Length

Max length: 25
Median length: 22
Mean length: 4.631226486
Min length: 1

Characters and Unicode

Total characters: 61549
Distinct characters: 73
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4106
Unique (%): 30.9%
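
For this mostly empty column, the percentages for Distinct and Unique appear to be taken over the non-missing values rather than over all rows. The sketch below checks that reading against the reported figures and shows the difference between the two notions on a toy list. It assumes "distinct" means the number of different values and "unique" means the number of values occurring exactly once; the toy data is made up.

```python
from collections import Counter

# Figures reported for recordNumber.
n_rows = 289_628
n_missing = 276_338
n_distinct = 5_837
n_unique = 4_106

# Assumed denominator: rows that actually hold a value.
n_present = n_rows - n_missing
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
print(f"{n_present} values, {distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique")

# Toy illustration: "a" occurs twice, so it is distinct but not unique.
toy = ["a", "a", "b", "c"]
counts = Counter(toy)
toy_distinct = len(counts)
toy_unique = sum(1 for n in counts.values() if n == 1)
print(toy_distinct, toy_unique)
```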

Sample

1st row: 1.3
2nd row: 4.3
3rd row: 6.4
4th row: 15
5th row: 175
Value Count Frequency (%)
no 3016
 
17.2%
reg 601
 
3.4%
reg.no 175
 
1.0%
n 85
 
0.5%
verz 57
 
0.3%
coll.-no 49
 
0.3%
2 47
 
0.3%
3 41
 
0.2%
1 41
 
0.2%
6 34
 
0.2%
Other values (4160) 13389
76.4%

Most occurring characters

ValueCountFrequency (%)
1 7134
11.6%
4 4703
 
7.6%
3 4671
 
7.6%
2 4607
 
7.5%
4247
 
6.9%
. 4085
 
6.6%
5 3931
 
6.4%
6 3619
 
5.9%
7 3512
 
5.7%
o 3431
 
5.6%
Other values (63) 17609
28.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 41965
68.2%
Lowercase Letter 6638
 
10.8%
Other Punctuation 4273
 
6.9%
Space Separator 4247
 
6.9%
Uppercase Letter 4115
 
6.7%
Close Punctuation 103
 
0.2%
Open Punctuation 103
 
0.2%
Dash Punctuation 81
 
0.1%
Math Symbol 24
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3431
51.7%
e 965
 
14.5%
g 815
 
12.3%
n 422
 
6.4%
r 223
 
3.4%
l 215
 
3.2%
v 79
 
1.2%
a 76
 
1.1%
z 73
 
1.1%
c 65
 
1.0%
Other values (14) 274
 
4.1%
Uppercase Letter
ValueCountFrequency (%)
N 3030
73.6%
R 739
 
18.0%
C 140
 
3.4%
I 65
 
1.6%
X 32
 
0.8%
V 16
 
0.4%
A 15
 
0.4%
G 13
 
0.3%
B 12
 
0.3%
L 12
 
0.3%
Other values (13) 41
 
1.0%
Decimal Number
ValueCountFrequency (%)
1 7134
17.0%
4 4703
11.2%
3 4671
11.1%
2 4607
11.0%
5 3931
9.4%
6 3619
8.6%
7 3512
8.4%
8 3321
7.9%
0 3240
7.7%
9 3227
7.7%
Other Punctuation
ValueCountFrequency (%)
. 4085
95.6%
: 105
 
2.5%
' 30
 
0.7%
, 16
 
0.4%
/ 16
 
0.4%
? 15
 
0.4%
; 4
 
0.1%
& 1
 
< 0.1%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 101
98.1%
] 2
 
1.9%
Open Punctuation
ValueCountFrequency (%)
( 101
98.1%
[ 2
 
1.9%
Space Separator
ValueCountFrequency (%)
4247
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 81
100.0%
Math Symbol
ValueCountFrequency (%)
= 24
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 50796
82.5%
Latin 10753
 
17.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3431
31.9%
N 3030
28.2%
e 965
 
9.0%
g 815
 
7.6%
R 739
 
6.9%
n 422
 
3.9%
r 223
 
2.1%
l 215
 
2.0%
C 140
 
1.3%
v 79
 
0.7%
Other values (37) 694
 
6.5%
Common
ValueCountFrequency (%)
1 7134
14.0%
4 4703
9.3%
3 4671
9.2%
2 4607
9.1%
4247
8.4%
. 4085
8.0%
5 3931
7.7%
6 3619
7.1%
7 3512
6.9%
8 3321
6.5%
Other values (16) 6966
13.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 61548
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 7134
11.6%
4 4703
 
7.6%
3 4671
 
7.6%
2 4607
 
7.5%
4247
 
6.9%
. 4085
 
6.6%
5 3931
 
6.4%
6 3619
 
5.9%
7 3512
 
5.7%
o 3431
 
5.6%
Other values (62) 17608
28.6%
Punctuation
ValueCountFrequency (%)
1
100.0%

recordedBy
Text

Missing 

Distinct: 11879
Distinct (%): 6.0%
Missing: 92827
Missing (%): 32.1%
Memory size: 2.2 MiB

Length

Max length: 252
Median length: 227
Mean length: 15.05751495
Min length: 2

Characters and Unicode

Total characters: 2963334
Distinct characters: 102
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6885
Unique (%): 3.5%

Sample

1st row: Van der Spruyt G.S.
2nd row: Groen J.
3rd row: Pollen&vDam cf Apr'63-Jun'66
4th row: Ploos van Amstel D.
5th row: Ebels E.
Value Count Frequency (%)
van 28340
 
5.3%
not 14646
 
2.7%
stated 13574
 
2.5%
12974
 
2.4%
bartels 11506
 
2.2%
j 10745
 
2.0%
de 10419
 
2.0%
heurn 8672
 
1.6%
m.e.g 8315
 
1.6%
f 7204
 
1.4%
Other values (8570) 406910
76.3%

Most occurring characters

ValueCountFrequency (%)
. 346161
 
11.7%
338325
 
11.4%
e 266038
 
9.0%
n 166306
 
5.6%
a 146851
 
5.0%
r 141966
 
4.8%
o 124914
 
4.2%
t 117243
 
4.0%
s 116206
 
3.9%
l 82761
 
2.8%
Other values (92) 1116563
37.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1630883
55.0%
Uppercase Letter 607392
 
20.5%
Other Punctuation 375251
 
12.7%
Space Separator 338325
 
11.4%
Decimal Number 4048
 
0.1%
Open Punctuation 2717
 
0.1%
Close Punctuation 2714
 
0.1%
Dash Punctuation 1950
 
0.1%
Math Symbol 53
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 266038
16.3%
n 166306
10.2%
a 146851
9.0%
r 141966
8.7%
o 124914
 
7.7%
t 117243
 
7.2%
s 116206
 
7.1%
l 82761
 
5.1%
i 72542
 
4.4%
d 62657
 
3.8%
Other values (34) 333399
20.4%
Uppercase Letter
ValueCountFrequency (%)
H 61610
 
10.1%
J 50514
 
8.3%
B 47475
 
7.8%
A 40470
 
6.7%
M 36296
 
6.0%
C 35111
 
5.8%
G 34471
 
5.7%
F 30888
 
5.1%
P 30040
 
4.9%
S 26970
 
4.4%
Other values (17) 213547
35.2%
Other Punctuation
ValueCountFrequency (%)
. 346161
92.2%
& 12702
 
3.4%
: 6268
 
1.7%
; 5125
 
1.4%
/ 1659
 
0.4%
\ 1596
 
0.4%
' 996
 
0.3%
? 375
 
0.1%
" 294
 
0.1%
! 60
 
< 0.1%
Other values (2) 15
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1074
26.5%
9 747
18.5%
0 629
15.5%
6 456
11.3%
2 349
 
8.6%
3 311
 
7.7%
8 215
 
5.3%
4 133
 
3.3%
7 76
 
1.9%
5 58
 
1.4%
Math Symbol
ValueCountFrequency (%)
= 38
71.7%
> 7
 
13.2%
+ 7
 
13.2%
| 1
 
1.9%
Space Separator
ValueCountFrequency (%)
338325
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2717
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2714
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1950
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2238275
75.5%
Common 725059
 
24.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 266038
 
11.9%
n 166306
 
7.4%
a 146851
 
6.6%
r 141966
 
6.3%
o 124914
 
5.6%
t 117243
 
5.2%
s 116206
 
5.2%
l 82761
 
3.7%
i 72542
 
3.2%
d 62657
 
2.8%
Other values (61) 940791
42.0%
Common
ValueCountFrequency (%)
. 346161
47.7%
338325
46.7%
& 12702
 
1.8%
: 6268
 
0.9%
; 5125
 
0.7%
( 2717
 
0.4%
) 2714
 
0.4%
- 1950
 
0.3%
/ 1659
 
0.2%
\ 1596
 
0.2%
Other values (21) 5842
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2955585
99.7%
None 7737
 
0.3%
Punctuation 12
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 346161
 
11.7%
338325
 
11.4%
e 266038
 
9.0%
n 166306
 
5.6%
a 146851
 
5.0%
r 141966
 
4.8%
o 124914
 
4.2%
t 117243
 
4.0%
s 116206
 
3.9%
l 82761
 
2.8%
Other values (72) 1108814
37.5%
None
ValueCountFrequency (%)
ü 5118
66.1%
é 1007
 
13.0%
ä 838
 
10.8%
ö 417
 
5.4%
ñ 143
 
1.8%
ø 118
 
1.5%
ë 34
 
0.4%
è 20
 
0.3%
ó 15
 
0.2%
û 8
 
0.1%
Other values (9) 19
 
0.2%
Punctuation
ValueCountFrequency (%)
12
100.0%

individualCount
Text

Missing 

Distinct 54
Distinct (%) < 0.1%
Missing 30538
Missing (%) 10.5%
Memory size 2.2 MiB

Length

Max length 3
Median length 1
Mean length 1.003743873
Min length 1

Characters and Unicode

Total characters 260060
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
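
As a rough illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the partial label mapping are illustrative assumptions, not the profiler's own code; `unicodedata.category` returns two-letter codes such as `Nd`, which are mapped here to the long names used in this report.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes (partial, illustrative mapping).
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
    "Cc": "Control",
}

def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code when no long name is listed.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

if __name__ == "__main__":
    print(category_counts(["1", "10", None, "2"]))
```

A column holding only digits, as here, ends up with a single category, which is consistent with "Distinct categories 1" for this variable.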

Unique

Unique 8
Unique (%) < 0.1%

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Value              Count   Frequency (%)
1                  227379  87.8%
2                  11832   4.6%
3                  6214    2.4%
4                  5617    2.2%
5                  3939    1.5%
6                  1721    0.7%
7                  695     0.3%
8                  426     0.2%
9                  305     0.1%
10                 260     0.1%
Other values (44)  702     0.3%

Most occurring characters

ValueCountFrequency (%)
1 228230
87.8%
2 12051
 
4.6%
3 6372
 
2.5%
4 5687
 
2.2%
5 4035
 
1.6%
6 1786
 
0.7%
7 749
 
0.3%
8 468
 
0.2%
9 372
 
0.1%
0 310
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260060
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 228230
87.8%
2 12051
 
4.6%
3 6372
 
2.5%
4 5687
 
2.2%
5 4035
 
1.6%
6 1786
 
0.7%
7 749
 
0.3%
8 468
 
0.2%
9 372
 
0.1%
0 310
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260060
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 228230
87.8%
2 12051
 
4.6%
3 6372
 
2.5%
4 5687
 
2.2%
5 4035
 
1.6%
6 1786
 
0.7%
7 749
 
0.3%
8 468
 
0.2%
9 372
 
0.1%
0 310
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260060
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 228230
87.8%
2 12051
 
4.6%
3 6372
 
2.5%
4 5687
 
2.2%
5 4035
 
1.6%
6 1786
 
0.7%
7 749
 
0.3%
8 468
 
0.2%
9 372
 
0.1%
0 310
 
0.1%

sex
Text

Missing 

Distinct 2
Distinct (%) < 0.1%
Missing 98166
Missing (%) 33.9%
Memory size 2.2 MiB

Length

Max length 6
Median length 4
Mean length 4.830525117
Min length 4

Characters and Unicode

Total characters 924862
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row female
2nd row female
3rd row male
4th row male
5th row female

Value   Count   Frequency (%)
male    111955  58.5%
female  79507   41.5%
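
The summary figures for this variable can be cross-checked against the value counts above. The short sketch below recomputes the missing share, the total character count, and the mean length from the reported counts. The total of 289628 observations comes from the report overview, and the variable names are illustrative.

```python
# Figures copied from the report for the `sex` variable.
n_observations = 289628          # from the dataset overview
n_missing = 98166
value_counts = {"male": 111955, "female": 79507}

n_present = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())

missing_pct = 100 * n_missing / n_observations
mean_length = total_chars / n_present

print(n_present + n_missing == n_observations)  # do missing and non-missing rows add up?
print(round(missing_pct, 1))                    # compare with the reported 33.9%
print(total_chars)                              # compare with the reported 924862
print(round(mean_length, 9))                    # compare with the reported 4.830525117
```

The mean length is taken over non-missing values only, which is why it sits between 4 ("male") and 6 ("female").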

Most occurring characters

ValueCountFrequency (%)
e 270969
29.3%
m 191462
20.7%
a 191462
20.7%
l 191462
20.7%
f 79507
 
8.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 924862
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 270969
29.3%
m 191462
20.7%
a 191462
20.7%
l 191462
20.7%
f 79507
 
8.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 924862
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 270969
29.3%
m 191462
20.7%
a 191462
20.7%
l 191462
20.7%
f 79507
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 924862
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 270969
29.3%
m 191462
20.7%
a 191462
20.7%
l 191462
20.7%
f 79507
 
8.6%

lifeStage
Text

Missing 

Distinct 96
Distinct (%) 0.1%
Missing 206842
Missing (%) 71.4%
Memory size 2.2 MiB

Length

Max length 31
Median length 3
Mean length 4.659568043
Min length 1

Characters and Unicode

Total characters 385747
Distinct characters 51
Distinct categories 10
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 36
Unique (%) < 0.1%

Sample

1st row egg
2nd row adult
3rd row adult
4th row immature
5th row juvenile

Value              Count  Frequency (%)
egg                41586  48.9%
adult              20714  24.3%
juvenile           13193  15.5%
pullus             3277   3.8%
c.y                1836   2.2%
immature           1548   1.8%
1st                1425   1.7%
2nd                563    0.7%
year               191    0.2%
kj                 158    0.2%
Other values (74)  636    0.7%

Most occurring characters

ValueCountFrequency (%)
g 83191
21.6%
e 70025
18.2%
u 42285
11.0%
l 40628
10.5%
t 23962
 
6.2%
a 22643
 
5.9%
d 21535
 
5.6%
i 14890
 
3.9%
n 13852
 
3.6%
j 13307
 
3.4%
Other values (41) 39429
10.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 376936
97.7%
Other Punctuation 3800
 
1.0%
Decimal Number 2345
 
0.6%
Space Separator 2341
 
0.6%
Uppercase Letter 216
 
0.1%
Dash Punctuation 54
 
< 0.1%
Math Symbol 40
 
< 0.1%
Close Punctuation 7
 
< 0.1%
Open Punctuation 7
 
< 0.1%
Other Number 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
g 83191
22.1%
e 70025
18.6%
u 42285
11.2%
l 40628
10.8%
t 23962
 
6.4%
a 22643
 
6.0%
d 21535
 
5.7%
i 14890
 
4.0%
n 13852
 
3.7%
j 13307
 
3.5%
Other values (14) 30618
 
8.1%
Decimal Number
ValueCountFrequency (%)
1 1552
66.2%
2 636
27.1%
3 96
 
4.1%
4 18
 
0.8%
5 14
 
0.6%
6 9
 
0.4%
9 9
 
0.4%
7 5
 
0.2%
8 5
 
0.2%
0 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
K 152
70.4%
J 53
 
24.5%
A 5
 
2.3%
S 3
 
1.4%
I 2
 
0.9%
W 1
 
0.5%
Other Punctuation
ValueCountFrequency (%)
. 3779
99.4%
, 12
 
0.3%
? 8
 
0.2%
/ 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
> 38
95.0%
± 2
 
5.0%
Space Separator
ValueCountFrequency (%)
2341
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 54
100.0%
Close Punctuation
ValueCountFrequency (%)
] 7
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 7
100.0%
Other Number
ValueCountFrequency (%)
¼ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 377152
97.8%
Common 8595
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
g 83191
22.1%
e 70025
18.6%
u 42285
11.2%
l 40628
10.8%
t 23962
 
6.4%
a 22643
 
6.0%
d 21535
 
5.7%
i 14890
 
3.9%
n 13852
 
3.7%
j 13307
 
3.5%
Other values (20) 30834
 
8.2%
Common
ValueCountFrequency (%)
. 3779
44.0%
2341
27.2%
1 1552
18.1%
2 636
 
7.4%
3 96
 
1.1%
- 54
 
0.6%
> 38
 
0.4%
4 18
 
0.2%
5 14
 
0.2%
, 12
 
0.1%
Other values (11) 55
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 385743
> 99.9%
None 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
g 83191
21.6%
e 70025
18.2%
u 42285
11.0%
l 40628
10.5%
t 23962
 
6.2%
a 22643
 
5.9%
d 21535
 
5.6%
i 14890
 
3.9%
n 13852
 
3.6%
j 13307
 
3.4%
Other values (38) 39425
10.2%
None
ValueCountFrequency (%)
± 2
50.0%
¼ 1
25.0%
é 1
25.0%

Distinct 132
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 2.2 MiB

Length

Max length 39
Median length 37
Mean length 16.94113829
Min length 3

Characters and Unicode

Total characters 4906628
Distinct characters 44
Distinct categories 10
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 45
Unique (%) < 0.1%

Sample

1st row skin (mounted skin)
2nd row egg (air dried)
3rd row skin (study skin)
4th row skin (mounted skin)
5th row skin (study skin)

Value              Count   Frequency (%)
skin               380315  44.5%
air                114349  13.4%
dried              114349  13.4%
study              108297  12.7%
mounted            47294   5.5%
egg                41587   4.9%
skeletonized       7000    0.8%
skeleton           5297    0.6%
nest               4724    0.6%
whole              4690    0.5%
Other values (57)  27515   3.2%

Most occurring characters

ValueCountFrequency (%)
i 631267
12.9%
565789
11.5%
s 520452
10.6%
n 453263
9.2%
k 396125
8.1%
d 393930
8.0%
) 289431
 
5.9%
( 289431
 
5.9%
e 260269
 
5.3%
r 234946
 
4.8%
Other values (34) 871725
17.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3753729
76.5%
Space Separator 565789
 
11.5%
Close Punctuation 289431
 
5.9%
Open Punctuation 289431
 
5.9%
Uppercase Letter 6292
 
0.1%
Decimal Number 1128
 
< 0.1%
Other Punctuation 601
 
< 0.1%
Math Symbol 217
 
< 0.1%
Dash Punctuation 8
 
< 0.1%
Modifier Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 631267
16.8%
s 520452
13.9%
n 453263
12.1%
k 396125
10.6%
d 393930
10.5%
e 260269
6.9%
r 234946
 
6.3%
t 179378
 
4.8%
u 160328
 
4.3%
a 122201
 
3.3%
Other values (13) 401570
10.7%
Uppercase Letter
ValueCountFrequency (%)
W 5239
83.3%
O 580
 
9.2%
H 322
 
5.1%
B 88
 
1.4%
L 34
 
0.5%
D 8
 
0.1%
N 8
 
0.1%
A 8
 
0.1%
T 5
 
0.1%
Decimal Number
ValueCountFrequency (%)
9 336
29.8%
6 336
29.8%
7 228
20.2%
0 228
20.2%
Other Punctuation
ValueCountFrequency (%)
% 564
93.8%
& 37
 
6.2%
Space Separator
ValueCountFrequency (%)
565789
100.0%
Close Punctuation
ValueCountFrequency (%)
) 289431
100.0%
Open Punctuation
ValueCountFrequency (%)
( 289431
100.0%
Math Symbol
ValueCountFrequency (%)
> 217
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3760021
76.6%
Common 1146607
 
23.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 631267
16.8%
s 520452
13.8%
n 453263
12.1%
k 396125
10.5%
d 393930
10.5%
e 260269
6.9%
r 234946
 
6.2%
t 179378
 
4.8%
u 160328
 
4.3%
a 122201
 
3.3%
Other values (22) 407862
10.8%
Common
ValueCountFrequency (%)
565789
49.3%
) 289431
25.2%
( 289431
25.2%
% 564
 
< 0.1%
9 336
 
< 0.1%
6 336
 
< 0.1%
7 228
 
< 0.1%
0 228
 
< 0.1%
> 217
 
< 0.1%
& 37
 
< 0.1%
Other values (2) 10
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4906628
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 631267
12.9%
565789
11.5%
s 520452
10.6%
n 453263
9.2%
k 396125
8.1%
d 393930
8.0%
) 289431
 
5.9%
( 289431
 
5.9%
e 260269
 
5.3%
r 234946
 
4.8%
Other values (34) 871725
17.8%

associatedTaxa
Text

Constant  Missing 

Distinct 1
Distinct (%) 33.3%
Missing 289625
Missing (%) > 99.9%
Memory size 2.2 MiB

Length

Max length 64
Median length 64
Mean length 64
Min length 64

Characters and Unicode

Total characters 192
Distinct characters 20
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp.
2nd row has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp.
3rd row has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp.

Value           Count  Frequency (%)
has             3      12.5%
parasite        3      12.5%
cirrophthirius  3      12.5%
cf              3      12.5%
recurvirostrae  3      12.5%
(empty)         3      12.5%
quadraceps      3      12.5%
sp              3      12.5%
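
The table above counts words rather than whole values: the three identical records yield eight tokens, each occurring three times. A minimal sketch of one tokenization that reproduces these counts is shown below. It assumes lowercasing, whitespace splitting, and stripping of surrounding punctuation, which may differ from what the profiler actually does; the function name `word_counts` is hypothetical.

```python
import string
from collections import Counter

def word_counts(values):
    """Lowercase each value, split on whitespace, strip surrounding punctuation."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            counts[token.strip(string.punctuation)] += 1
    return counts

rows = ["has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp."] * 3
counts = word_counts(rows)
total = sum(counts.values())
for token, count in counts.items():
    # A bare "|" strips down to an empty token, matching the blank row in the table.
    print(repr(token), count, f"{100 * count / total:.1f}%")
```

Under these assumptions every token accounts for 3 of 24 occurrences, which is where the uniform 12.5% comes from.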

Most occurring characters

ValueCountFrequency (%)
r 27
14.1%
21
10.9%
s 18
9.4%
a 18
9.4%
i 15
 
7.8%
p 12
 
6.2%
e 12
 
6.2%
h 9
 
4.7%
t 9
 
4.7%
u 9
 
4.7%
Other values (10) 42
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 153
79.7%
Space Separator 21
 
10.9%
Other Punctuation 9
 
4.7%
Uppercase Letter 6
 
3.1%
Math Symbol 3
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 27
17.6%
s 18
11.8%
a 18
11.8%
i 15
9.8%
p 12
7.8%
e 12
7.8%
h 9
 
5.9%
t 9
 
5.9%
u 9
 
5.9%
c 9
 
5.9%
Other values (4) 15
9.8%
Other Punctuation
ValueCountFrequency (%)
. 6
66.7%
: 3
33.3%
Uppercase Letter
ValueCountFrequency (%)
Q 3
50.0%
C 3
50.0%
Space Separator
ValueCountFrequency (%)
21
100.0%
Math Symbol
ValueCountFrequency (%)
| 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 159
82.8%
Common 33
 
17.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 27
17.0%
s 18
11.3%
a 18
11.3%
i 15
9.4%
p 12
7.5%
e 12
7.5%
h 9
 
5.7%
t 9
 
5.7%
u 9
 
5.7%
c 9
 
5.7%
Other values (6) 21
13.2%
Common
ValueCountFrequency (%)
21
63.6%
. 6
 
18.2%
| 3
 
9.1%
: 3
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 27
14.1%
21
10.9%
s 18
9.4%
a 18
9.4%
i 15
 
7.8%
p 12
 
6.2%
e 12
 
6.2%
h 9
 
4.7%
t 9
 
4.7%
u 9
 
4.7%
Other values (10) 42
21.9%

eventDate
Text

Missing 

Distinct 44808
Distinct (%) 20.8%
Missing 74040
Missing (%) 25.6%
Memory size 2.2 MiB

Length

Max length 21
Median length 10
Mean length 11.37834202
Min length 10

Characters and Unicode

Total characters 2453034
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 12124
Unique (%) 5.6%

Sample

1st row 1904-07-15
2nd row 1887-11-19
3rd row 2014-01-05
4th row 2008-09-09
5th row 2006-04-22

Value                  Count   Frequency (%)
1875-10-01/1875-10-31  571     0.3%
1901-01-01/1901-12-31  442     0.2%
1930-01-01/1951-12-31  384     0.2%
1912-01-01/1916-12-31  312     0.1%
1820-12-01/1821-09-30  310     0.1%
1862-01-01/1862-12-31  290     0.1%
1903-01-01/1908-12-31  283     0.1%
1868-01-01/1868-12-31  283     0.1%
1982-01-01/1982-12-31  260     0.1%
1861-01-01/1861-12-31  240     0.1%
Other values (44798)   212213  98.4%
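
The values above are either a single ISO 8601 date (10 characters) or a start/end interval separated by `/` (21 characters). The sketch below, with the hypothetical helper `parse_event_date`, splits such a string into a start and an end date using only the standard library. It is an assumption that no other formats occur in this column.

```python
from datetime import date

def parse_event_date(text):
    """Return (start, end) dates for 'YYYY-MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD'."""
    start_text, _, end_text = text.partition("/")
    start = date.fromisoformat(start_text)
    # A single date is treated as an interval of one day.
    end = date.fromisoformat(end_text) if end_text else start
    return start, end

start, end = parse_event_date("1820-12-01/1821-09-30")
print(start, end, (end - start).days + 1)
print(parse_event_date("1904-07-15"))
```

The reported character totals are consistent with this reading: 215588 non-missing values at 10 characters each, plus 11 extra characters for each of the 27014 `/` separators, gives 2453034.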

Most occurring characters

ValueCountFrequency (%)
1 532867
21.7%
- 485204
19.8%
0 370078
15.1%
9 253338
10.3%
2 178215
 
7.3%
8 134256
 
5.5%
3 112582
 
4.6%
6 104197
 
4.2%
5 96176
 
3.9%
7 81018
 
3.3%
Other values (2) 105103
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1940816
79.1%
Dash Punctuation 485204
 
19.8%
Other Punctuation 27014
 
1.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 532867
27.5%
0 370078
19.1%
9 253338
13.1%
2 178215
 
9.2%
8 134256
 
6.9%
3 112582
 
5.8%
6 104197
 
5.4%
5 96176
 
5.0%
7 81018
 
4.2%
4 78089
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
- 485204
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 27014
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2453034
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 532867
21.7%
- 485204
19.8%
0 370078
15.1%
9 253338
10.3%
2 178215
 
7.3%
8 134256
 
5.5%
3 112582
 
4.6%
6 104197
 
4.2%
5 96176
 
3.9%
7 81018
 
3.3%
Other values (2) 105103
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2453034
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 532867
21.7%
- 485204
19.8%
0 370078
15.1%
9 253338
10.3%
2 178215
 
7.3%
8 134256
 
5.5%
3 112582
 
4.6%
6 104197
 
4.2%
5 96176
 
3.9%
7 81018
 
3.3%
Other values (2) 105103
 
4.3%

verbatimEventDate
Text

Missing 

Distinct 75421
Distinct (%) 32.8%
Missing 59530
Missing (%) 20.6%
Memory size 2.2 MiB

Length

Max length 255
Median length 10
Mean length 10.3775261
Min length 1

Characters and Unicode

Total characters 2387848
Distinct characters 100
Distinct categories 12
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 36236
Unique (%) 15.7%

Sample

1st row 15/7/1904
2nd row 19-11-1887
3rd row before 1880
4th row 5 januari 2014
5th row 9 september 2008

Value                 Count   Frequency (%)
(empty)               5950    2.0%
on                    4818    1.6%
label                 4338    1.5%
may                   1985    0.7%
april                 1642    0.6%
september             1503    0.5%
october               1244    0.4%
june                  1238    0.4%
december              1221    0.4%
november              1151    0.4%
Other values (69551)  267833  91.4%
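
The samples above mix numeric day-first dates (`15/7/1904`, `19-11-1887`) with free text such as `before 1880` and Dutch month names. As a hedged sketch, the hypothetical helper `parse_numeric_verbatim` below handles only the numeric day/month/year pattern and returns `None` for everything else. The day-first order is an assumption based on these samples.

```python
import re
from datetime import date

# Day, month, four-digit year separated by "/" or "-" (assumed day-first).
NUMERIC_DATE = re.compile(r"^(\d{1,2})[/-](\d{1,2})[/-](\d{4})$")

def parse_numeric_verbatim(text):
    """Parse 'D/M/YYYY' or 'D-M-YYYY'; return None when the text does not fit."""
    match = NUMERIC_DATE.match(text.strip())
    if match is None:
        return None
    day, month, year = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. month 13 or day 31 in a 30-day month
        return None

for sample in ["15/7/1904", "19-11-1887", "before 1880", "5 januari 2014"]:
    print(sample, "->", parse_numeric_verbatim(sample))
```

Returning `None` rather than guessing keeps free-text entries available for manual review.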

Most occurring characters

ValueCountFrequency (%)
1 469802
19.7%
- 338404
14.2%
9 254561
10.7%
0 217025
9.1%
2 169904
 
7.1%
8 129201
 
5.4%
6 103192
 
4.3%
5 95718
 
4.0%
3 93888
 
3.9%
/ 82961
 
3.5%
Other values (90) 433192
18.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1693497
70.9%
Dash Punctuation 338405
 
14.2%
Lowercase Letter 164978
 
6.9%
Other Punctuation 99650
 
4.2%
Space Separator 64207
 
2.7%
Uppercase Letter 25942
 
1.1%
Math Symbol 629
 
< 0.1%
Open Punctuation 269
 
< 0.1%
Close Punctuation 267
 
< 0.1%
Modifier Symbol 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 25947
15.7%
a 16506
10.0%
r 15609
9.5%
l 15600
9.5%
b 12840
 
7.8%
n 11608
 
7.0%
u 9043
 
5.5%
o 8682
 
5.3%
t 7040
 
4.3%
i 6952
 
4.2%
Other values (26) 35151
21.3%
Uppercase Letter
ValueCountFrequency (%)
M 4257
16.4%
O 4234
16.3%
J 3813
14.7%
A 2864
11.0%
N 1992
7.7%
D 1848
7.1%
S 1721
6.6%
I 1097
 
4.2%
F 1078
 
4.2%
H 788
 
3.0%
Other values (16) 2250
8.7%
Other Punctuation
ValueCountFrequency (%)
/ 82961
83.3%
, 5609
 
5.6%
: 5189
 
5.2%
. 3973
 
4.0%
' 733
 
0.7%
\ 686
 
0.7%
? 373
 
0.4%
" 49
 
< 0.1%
; 34
 
< 0.1%
! 24
 
< 0.1%
Other values (3) 19
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 469802
27.7%
9 254561
15.0%
0 217025
12.8%
2 169904
 
10.0%
8 129201
 
7.6%
6 103192
 
6.1%
5 95718
 
5.7%
3 93888
 
5.5%
7 81597
 
4.8%
4 78609
 
4.6%
Math Symbol
ValueCountFrequency (%)
± 585
93.0%
> 16
 
2.5%
< 14
 
2.2%
+ 10
 
1.6%
= 4
 
0.6%
Dash Punctuation
ValueCountFrequency (%)
- 338404
> 99.9%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 185
68.8%
[ 84
31.2%
Close Punctuation
ValueCountFrequency (%)
) 184
68.9%
] 83
31.1%
Space Separator
ValueCountFrequency (%)
64207
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 2
100.0%
Other Number
ValueCountFrequency (%)
½ 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2196928
92.0%
Latin 190920
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 25947
13.6%
a 16506
 
8.6%
r 15609
 
8.2%
l 15600
 
8.2%
b 12840
 
6.7%
n 11608
 
6.1%
u 9043
 
4.7%
o 8682
 
4.5%
t 7040
 
3.7%
i 6952
 
3.6%
Other values (52) 61093
32.0%
Common
ValueCountFrequency (%)
1 469802
21.4%
- 338404
15.4%
9 254561
11.6%
0 217025
9.9%
2 169904
 
7.7%
8 129201
 
5.9%
6 103192
 
4.7%
5 95718
 
4.4%
3 93888
 
4.3%
/ 82961
 
3.8%
Other values (28) 242272
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2387101
> 99.9%
None 739
 
< 0.1%
Punctuation 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 469802
19.7%
- 338404
14.2%
9 254561
10.7%
0 217025
9.1%
2 169904
 
7.1%
8 129201
 
5.4%
6 103192
 
4.3%
5 95718
 
4.0%
3 93888
 
3.9%
/ 82961
 
3.5%
Other values (75) 432445
18.1%
None
ValueCountFrequency (%)
± 585
79.2%
ü 63
 
8.5%
é 35
 
4.7%
ä 28
 
3.8%
â 16
 
2.2%
ó 4
 
0.5%
´ 2
 
0.3%
ï 1
 
0.1%
à 1
 
0.1%
è 1
 
0.1%
Other values (3) 3
 
0.4%
Punctuation
ValueCountFrequency (%)
7
87.5%
1
 
12.5%

island
Text

Missing 

Distinct 1621
Distinct (%) 1.8%
Missing 200031
Missing (%) 69.1%
Memory size 2.2 MiB

Length

Max length 49
Median length 47
Mean length 6.736609485
Min length 3

Characters and Unicode

Total characters 603580
Distinct characters 85
Distinct categories 9
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 707
Unique (%) 0.8%

Sample

1st row South Island
2nd row Vlieland
3rd row Moluccas
4th row Moluccas
5th row Moluccas

Value                Count  Frequency (%)
java                 34371  32.3%
sumatra              10736  10.1%
celebes              5387   5.1%
guinea               4479   4.2%
new                  3703   3.5%
borneo               3663   3.4%
islands              3174   3.0%
texel                2876   2.7%
sunda                2297   2.2%
lesser               2296   2.2%
Other values (1285)  33356  31.4%

Most occurring characters

ValueCountFrequency (%)
a 133964
22.2%
e 53532
 
8.9%
v 34897
 
5.8%
J 34642
 
5.7%
r 30497
 
5.1%
n 28902
 
4.8%
u 26445
 
4.4%
s 25512
 
4.2%
l 23542
 
3.9%
o 21887
 
3.6%
Other values (75) 189760
31.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 477676
79.1%
Uppercase Letter 106307
 
17.6%
Space Separator 16741
 
2.8%
Other Punctuation 1788
 
0.3%
Open Punctuation 376
 
0.1%
Close Punctuation 376
 
0.1%
Dash Punctuation 313
 
0.1%
Decimal Number 2
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 133964
28.0%
e 53532
 
11.2%
v 34897
 
7.3%
r 30497
 
6.4%
n 28902
 
6.1%
u 26445
 
5.5%
s 25512
 
5.3%
l 23542
 
4.9%
o 21887
 
4.6%
i 18350
 
3.8%
Other values (34) 80148
16.8%
Uppercase Letter
ValueCountFrequency (%)
J 34642
32.6%
S 17059
16.0%
C 7961
 
7.5%
B 6702
 
6.3%
T 5800
 
5.5%
G 5520
 
5.2%
I 5479
 
5.2%
N 5205
 
4.9%
M 4460
 
4.2%
L 3694
 
3.5%
Other values (17) 9785
 
9.2%
Other Punctuation
ValueCountFrequency (%)
. 1134
63.4%
, 605
33.8%
? 27
 
1.5%
' 19
 
1.1%
/ 3
 
0.2%
Open Punctuation
ValueCountFrequency (%)
[ 335
89.1%
( 41
 
10.9%
Close Punctuation
ValueCountFrequency (%)
] 335
89.1%
) 41
 
10.9%
Decimal Number
ValueCountFrequency (%)
0 1
50.0%
1 1
50.0%
Space Separator
ValueCountFrequency (%)
16741
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 313
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 583983
96.8%
Common 19597
 
3.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 133964
22.9%
e 53532
 
9.2%
v 34897
 
6.0%
J 34642
 
5.9%
r 30497
 
5.2%
n 28902
 
4.9%
u 26445
 
4.5%
s 25512
 
4.4%
l 23542
 
4.0%
o 21887
 
3.7%
Other values (61) 170163
29.1%
Common
ValueCountFrequency (%)
16741
85.4%
. 1134
 
5.8%
, 605
 
3.1%
[ 335
 
1.7%
] 335
 
1.7%
- 313
 
1.6%
( 41
 
0.2%
) 41
 
0.2%
? 27
 
0.1%
' 19
 
0.1%
Other values (4) 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 601599
99.7%
None 1981
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 133964
22.3%
e 53532
 
8.9%
v 34897
 
5.8%
J 34642
 
5.8%
r 30497
 
5.1%
n 28902
 
4.8%
u 26445
 
4.4%
s 25512
 
4.2%
l 23542
 
3.9%
o 21887
 
3.6%
Other values (55) 187779
31.2%
None
ValueCountFrequency (%)
ç 1159
58.5%
ë 257
 
13.0%
é 196
 
9.9%
ø 169
 
8.5%
ö 100
 
5.0%
Ö 38
 
1.9%
á 11
 
0.6%
ü 10
 
0.5%
ã 9
 
0.5%
í 9
 
0.5%
Other values (10) 23
 
1.2%

country
Text

Missing 

Distinct 955
Distinct (%) 0.4%
Missing 45132
Missing (%) 15.6%
Memory size 2.2 MiB

Length

Max length 726566
Median length 35
Mean length 12.12260732
Min length 1

Characters and Unicode

Total characters 2963929
Distinct characters 98
Distinct categories 11
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 318
Unique (%) 0.1%

Sample

1st row Netherlands
2nd row Australia
3rd row Australia
4th row Australia
5th row Senegal

Value                Count   Frequency (%)
indonesia            77317   25.0%
netherlands          71334   23.1%
suriname             13444   4.3%
kenya                3717    1.2%
brazil               3487    1.1%
australia            3352    1.1%
colombia             3024    1.0%
africa               2965    1.0%
united               2726    0.9%
south                2679    0.9%
Other values (8952)  125394  40.5%

Most occurring characters

ValueCountFrequency (%)
n 313710
 
10.6%
e 305059
 
10.3%
a 290861
 
9.8%
240032
 
8.1%
s 192751
 
6.5%
i 181345
 
6.1%
d 178989
 
6.0%
r 136005
 
4.6%
l 117280
 
4.0%
o 116282
 
3.9%
Other values (88) 891615
30.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2231289
75.3%
Uppercase Letter 330487
 
11.2%
Control 241302
 
8.1%
Decimal Number 84702
 
2.9%
Other Punctuation 34208
 
1.2%
Space Separator 29500
 
1.0%
Dash Punctuation 4687
 
0.2%
Open Punctuation 3594
 
0.1%
Close Punctuation 3591
 
0.1%
Math Symbol 565
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 313710
14.1%
e 305059
13.7%
a 290861
13.0%
s 192751
8.6%
i 181345
8.1%
d 178989
8.0%
r 136005
6.1%
l 117280
 
5.3%
o 116282
 
5.2%
t 114977
 
5.2%
Other values (30) 284030
12.7%
Uppercase Letter
ValueCountFrequency (%)
I 83868
25.4%
N 81553
24.7%
S 32962
 
10.0%
A 22269
 
6.7%
C 16141
 
4.9%
T 10848
 
3.3%
E 9595
 
2.9%
G 7812
 
2.4%
M 7668
 
2.3%
R 7514
 
2.3%
Other values (17) 50257
15.2%
Decimal Number
ValueCountFrequency (%)
1 14116
16.7%
0 14092
16.6%
2 10470
12.4%
6 8215
9.7%
8 7625
9.0%
4 7138
8.4%
3 6084
7.2%
5 6035
7.1%
7 5670
6.7%
9 5257
 
6.2%
Other Punctuation
ValueCountFrequency (%)
. 16392
47.9%
/ 13313
38.9%
: 2540
 
7.4%
, 1449
 
4.2%
& 281
 
0.8%
? 116
 
0.3%
; 57
 
0.2%
' 56
 
0.2%
" 4
 
< 0.1%
Control
ValueCountFrequency (%)
240032
99.5%
1270
 
0.5%
Space Separator
ValueCountFrequency (%)
29496
> 99.9%
  4
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 4615
98.5%
72
 
1.5%
Open Punctuation
ValueCountFrequency (%)
( 3565
99.2%
[ 29
 
0.8%
Close Punctuation
ValueCountFrequency (%)
) 3562
99.2%
] 29
 
0.8%
Math Symbol
ValueCountFrequency (%)
| 565
100.0%
Format
ValueCountFrequency (%)
4
100.0%
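
The Control category above (241302 characters), together with a maximum length of 726566, suggests that at least one `country` value contains embedded line breaks or other non-printing characters. A minimal sketch for flagging such values is given below. The helper `find_suspect_values` and its `max_len` threshold are illustrative choices, not part of the report.

```python
import unicodedata

def find_suspect_values(values, max_len=100):
    """Yield (index, reason) for values with control characters or unusual length."""
    for index, value in enumerate(values):
        if value is None:
            continue
        # "Cc" is the Unicode general category for control characters.
        if any(unicodedata.category(ch) == "Cc" for ch in value):
            yield index, "control character"
        elif len(value) > max_len:
            yield index, "too long"

values = ["Netherlands", "Australia\nSenegal", None, "x" * 150]
print(list(find_suspect_values(values)))
```

Flagged rows can then be inspected by hand before deciding whether the field was mis-split during export.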

Most occurring scripts

ValueCountFrequency (%)
Latin 2561776
86.4%
Common 402153
 
13.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 313710
12.2%
e 305059
11.9%
a 290861
11.4%
s 192751
 
7.5%
i 181345
 
7.1%
d 178989
 
7.0%
r 136005
 
5.3%
l 117280
 
4.6%
o 116282
 
4.5%
t 114977
 
4.5%
Other values (57) 614517
24.0%
Common
ValueCountFrequency (%)
240032
59.7%
29496
 
7.3%
. 16392
 
4.1%
1 14116
 
3.5%
0 14092
 
3.5%
/ 13313
 
3.3%
2 10470
 
2.6%
6 8215
 
2.0%
8 7625
 
1.9%
4 7138
 
1.8%
Other values (21) 41264
 
10.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2962939
> 99.9%
None 914
 
< 0.1%
Punctuation 76
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 313710
 
10.6%
e 305059
 
10.3%
a 290861
 
9.8%
240032
 
8.1%
s 192751
 
6.5%
i 181345
 
6.1%
d 178989
 
6.0%
r 136005
 
4.6%
l 117280
 
4.0%
o 116282
 
3.9%
Other values (70) 890625
30.1%
None
ValueCountFrequency (%)
ë 314
34.4%
é 242
26.5%
ü 174
19.0%
ç 47
 
5.1%
ô 44
 
4.8%
ã 33
 
3.6%
ä 23
 
2.5%
í 19
 
2.1%
  4
 
0.4%
ê 2
 
0.2%
Other values (6) 12
 
1.3%
Punctuation
ValueCountFrequency (%)
72
94.7%
4
 
5.3%

stateProvince
Text

Missing 

Distinct 7165
Distinct (%) 4.7%
Missing 136488
Missing (%) 47.1%
Memory size 2.2 MiB

Length

Max length 80
Median length 71
Mean length 11.67741282
Min length 1

Characters and Unicode

Total characters 1788279
Distinct characters 115
Distinct categories 11
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3135
Unique (%) 2.0%

Sample

1st row South Holland
2nd row New South Wales
3rd row South Australia
4th row Queensland
5th row Friesland

Value                Count   Frequency (%)
holland              26720   10.7%
north                19018   7.6%
south                12914   5.2%
preanger             9150    3.7%
java                 8836    3.5%
gelderland           6559    2.6%
friesland            4323    1.7%
guinea               4254    1.7%
overijssel           3397    1.4%
utrecht              3319    1.3%
Other values (5322)  150886  60.5%

Most occurring characters

ValueCountFrequency (%)
a 200865
 
11.2%
e 141835
 
7.9%
n 125071
 
7.0%
r 121778
 
6.8%
l 120964
 
6.8%
o 110545
 
6.2%
96244
 
5.4%
t 82822
 
4.6%
i 75442
 
4.2%
d 74476
 
4.2%
Other values (105) 638237
35.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1388212
77.6%
Uppercase Letter 253969
 
14.2%
Space Separator 96244
 
5.4%
Other Punctuation 36534
 
2.0%
Dash Punctuation 10865
 
0.6%
Close Punctuation 1011
 
0.1%
Open Punctuation 1010
 
0.1%
Decimal Number 229
 
< 0.1%
Math Symbol 201
 
< 0.1%
Other Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 200865
14.5%
e 141835
10.2%
n 125071
9.0%
r 121778
8.8%
l 120964
8.7%
o 110545
8.0%
t 82822
 
6.0%
i 75442
 
5.4%
d 74476
 
5.4%
s 60301
 
4.3%
Other values (41) 274113
19.7%
Uppercase Letter
ValueCountFrequency (%)
N 33515
13.2%
H 30922
12.2%
S 26812
 
10.6%
P 18535
 
7.3%
G 16704
 
6.6%
B 13043
 
5.1%
C 11297
 
4.4%
W 10881
 
4.3%
J 10650
 
4.2%
M 10502
 
4.1%
Other values (21) 71108
28.0%
Decimal Number
ValueCountFrequency (%)
0 92
40.2%
1 62
27.1%
5 13
 
5.7%
4 13
 
5.7%
6 13
 
5.7%
2 11
 
4.8%
9 10
 
4.4%
3 7
 
3.1%
8 4
 
1.7%
7 4
 
1.7%
Other Punctuation
ValueCountFrequency (%)
, 19161
52.4%
. 16671
45.6%
/ 201
 
0.6%
& 170
 
0.5%
' 157
 
0.4%
: 113
 
0.3%
? 50
 
0.1%
" 8
 
< 0.1%
; 3
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
< 83
41.3%
> 83
41.3%
± 27
 
13.4%
= 8
 
4.0%
Close Punctuation
ValueCountFrequency (%)
] 779
77.1%
) 231
 
22.8%
} 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 777
76.9%
( 232
 
23.0%
{ 1
 
0.1%
Space Separator
ValueCountFrequency (%)
96244
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10865
100.0%
Other Symbol
ValueCountFrequency (%)
° 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1642181
91.8%
Common 146098
 
8.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 200865
 
12.2%
e 141835
 
8.6%
n 125071
 
7.6%
r 121778
 
7.4%
l 120964
 
7.4%
o 110545
 
6.7%
t 82822
 
5.0%
i 75442
 
4.6%
d 74476
 
4.5%
s 60301
 
3.7%
Other values (72) 528082
32.2%
Common
ValueCountFrequency (%)
96244
65.9%
, 19161
 
13.1%
. 16671
 
11.4%
- 10865
 
7.4%
] 779
 
0.5%
[ 777
 
0.5%
( 232
 
0.2%
) 231
 
0.2%
/ 201
 
0.1%
& 170
 
0.1%
Other values (23) 767
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1781922
99.6%
None 6357
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 200865
 
11.3%
e 141835
 
8.0%
n 125071
 
7.0%
r 121778
 
6.8%
l 120964
 
6.8%
o 110545
 
6.2%
96244
 
5.4%
t 82822
 
4.6%
i 75442
 
4.2%
d 74476
 
4.2%
Other values (73) 631880
35.5%
None
ValueCountFrequency (%)
â 2265
35.6%
ë 2107
33.1%
ä 509
 
8.0%
é 407
 
6.4%
ü 207
 
3.3%
ô 191
 
3.0%
ö 128
 
2.0%
è 126
 
2.0%
á 90
 
1.4%
å 55
 
0.9%
Other values (22) 272
 
4.3%

locality
Text

Missing 

Distinct 29689
Distinct (%) 14.1%
Missing 78963
Missing (%) 27.3%
Memory size 2.2 MiB

Length

Max length 19409
Median length 93
Mean length 16.26432488
Min length 2

Characters and Unicode

Total characters 3426324
Distinct characters 135
Distinct categories 16
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 16266
Unique (%) 7.7%

Sample

1st row Lisse
2nd row New South Wales, no further locality
3rd row Kangaroo I.
4th row sine loco [SW & SE Australia]
5th row Senegal, no further locality

Value                 Count   Frequency (%)
locality              9277    1.9%
no                    9263    1.9%
further               9250    1.9%
i                     8571    1.8%
java                  8148    1.7%
sine                  6339    1.3%
loco                  6337    1.3%
west                  5995    1.2%
area                  5203    1.1%
pangerango            4784    1.0%
Other values (24964)  411903  84.9%

Most occurring characters

ValueCountFrequency (%)
a 339907
 
9.9%
e 313933
 
9.2%
273601
 
8.0%
n 233233
 
6.8%
r 209604
 
6.1%
o 207755
 
6.1%
i 173151
 
5.1%
t 131760
 
3.8%
l 129558
 
3.8%
s 107158
 
3.1%
Other values (125) 1306664
38.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2564529
74.8%
Uppercase Letter 394161
 
11.5%
Space Separator 273602
 
8.0%
Other Punctuation 115545
 
3.4%
Decimal Number 19064
 
0.6%
Close Punctuation 18996
 
0.6%
Open Punctuation 18994
 
0.6%
Dash Punctuation 11177
 
0.3%
Control 6080
 
0.2%
Math Symbol 3131
 
0.1%
Other values (6) 1045
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 339907
13.3%
e 313933
12.2%
n 233233
 
9.1%
r 209604
 
8.2%
o 207755
 
8.1%
i 173151
 
6.8%
t 131760
 
5.1%
l 129558
 
5.1%
s 107158
 
4.2%
u 105930
 
4.1%
Other values (44) 612540
23.9%
Uppercase Letter
ValueCountFrequency (%)
S 39583
 
10.0%
B 32492
 
8.2%
M 27887
 
7.1%
P 26854
 
6.8%
W 26118
 
6.6%
N 20241
 
5.1%
K 19942
 
5.1%
H 18010
 
4.6%
T 17964
 
4.6%
L 17349
 
4.4%
Other values (25) 147721
37.5%
Other Punctuation
ValueCountFrequency (%)
, 71188
61.6%
. 22381
 
19.4%
' 8896
 
7.7%
/ 6671
 
5.8%
? 2997
 
2.6%
" 2017
 
1.7%
& 989
 
0.9%
: 285
 
0.2%
! 70
 
0.1%
; 37
 
< 0.1%
Other values (2) 14
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 5703
29.9%
1 2825
14.8%
5 2274
 
11.9%
2 2171
 
11.4%
3 1441
 
7.6%
4 1133
 
5.9%
6 969
 
5.1%
8 968
 
5.1%
7 808
 
4.2%
9 772
 
4.0%
Math Symbol
ValueCountFrequency (%)
= 1027
32.8%
> 1022
32.6%
< 995
31.8%
± 50
 
1.6%
| 34
 
1.1%
+ 2
 
0.1%
~ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 12698
66.9%
( 6295
33.1%
{ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 12696
66.8%
) 6293
33.1%
} 7
 
< 0.1%
Space Separator
ValueCountFrequency (%)
273601
> 99.9%
  1
 
< 0.1%
Control
ValueCountFrequency (%)
6048
99.5%
32
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
- 11177
100.0%
Other Symbol
ValueCountFrequency (%)
° 615
100.0%
Final Punctuation
ValueCountFrequency (%)
312
100.0%
Initial Punctuation
ValueCountFrequency (%)
64
100.0%
Other Letter
ValueCountFrequency (%)
º 38
100.0%
Other Number
ValueCountFrequency (%)
½ 12
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2958728
86.4%
Common 467596
 
13.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 339907
 
11.5%
e 313933
 
10.6%
n 233233
 
7.9%
r 209604
 
7.1%
o 207755
 
7.0%
i 173151
 
5.9%
t 131760
 
4.5%
l 129558
 
4.4%
s 107158
 
3.6%
u 105930
 
3.6%
Other values (80) 1006739
34.0%
Common
ValueCountFrequency (%)
273601
58.5%
, 71188
 
15.2%
. 22381
 
4.8%
[ 12698
 
2.7%
] 12696
 
2.7%
- 11177
 
2.4%
' 8896
 
1.9%
/ 6671
 
1.4%
( 6295
 
1.3%
) 6293
 
1.3%
Other values (35) 35700
 
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3419656
99.8%
None 6289
 
0.2%
Punctuation 379
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 339907
 
9.9%
e 313933
 
9.2%
273601
 
8.0%
n 233233
 
6.8%
r 209604
 
6.1%
o 207755
 
6.1%
i 173151
 
5.1%
t 131760
 
3.9%
l 129558
 
3.8%
s 107158
 
3.1%
Other values (80) 1299996
38.0%
None
ValueCountFrequency (%)
é 1758
28.0%
ö 718
11.4%
° 615
 
9.8%
ä 573
 
9.1%
â 465
 
7.4%
ü 379
 
6.0%
ë 339
 
5.4%
è 184
 
2.9%
å 160
 
2.5%
Ö 130
 
2.1%
Other values (32) 968
15.4%
Punctuation
ValueCountFrequency (%)
312
82.3%
64
 
16.9%
3
 
0.8%

verbatimElevation
Text

Missing 

Distinct: 716
Distinct (%): 27.7%
Missing: 287041
Missing (%): 99.1%
Memory size: 2.2 MiB

Length

Max length: 30
Median length: 27
Mean length: 7.081175106
Min length: 2

Characters and Unicode

Total characters: 18319
Distinct characters: 57
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 421
Unique (%): 16.3%
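
The percentages and mean length reported for verbatimElevation are consistent with the raw counts if Distinct (%), Unique (%) and the mean length are taken relative to the non-missing rows, while Missing (%) is relative to all 289628 observations. The sketch below re-derives them under that assumption; the function name `summary_percentages` is hypothetical, and the choice of denominators is inferred from the numbers, not taken from the profiler's documentation.

```python
def summary_percentages(n_rows, n_missing, n_distinct, n_unique, total_chars):
    """Re-derive summary figures from raw counts (denominators are an assumption)."""
    n_present = n_rows - n_missing  # rows that actually hold a value
    return {
        "missing_pct": 100 * n_missing / n_rows,
        "distinct_pct": 100 * n_distinct / n_present,
        "unique_pct": 100 * n_unique / n_present,
        "mean_length": total_chars / n_present,
    }

# Counts as reported for verbatimElevation; 289628 is the dataset's number of observations.
stats = summary_percentages(289628, 287041, 716, 421, 18319)
for key, value in stats.items():
    print(f"{key}: {value:.3f}")
```

The same arithmetic can be applied to any other variable in the report by substituting its counts.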

Sample

1st row: 1700 m.
2nd row: ± 100 Meter
3rd row: ± 100 m
4th row: asc 3000 ft
5th row: 7000'

Value                Count   Frequency (%)
m                    1564    30.9%
meter                212     4.2%
ft                   177     3.5%
±                    168     3.3%
6000                 137     2.7%
7000                 121     2.4%
1000                 106     2.1%
900                  102     2.0%
1800                 101     2.0%
3000                 101     2.0%
Other values (358)   2280    45.0%

Most occurring characters

ValueCountFrequency (%)
0 5678
31.0%
2483
13.6%
m 1262
 
6.9%
1 1022
 
5.6%
. 814
 
4.4%
5 685
 
3.7%
M 616
 
3.4%
e 596
 
3.3%
' 548
 
3.0%
2 519
 
2.8%
Other values (47) 4096
22.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10006
54.6%
Lowercase Letter 3329
 
18.2%
Space Separator 2483
 
13.6%
Other Punctuation 1432
 
7.8%
Uppercase Letter 663
 
3.6%
Math Symbol 202
 
1.1%
Dash Punctuation 194
 
1.1%
Open Punctuation 5
 
< 0.1%
Close Punctuation 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 1262
37.9%
e 596
17.9%
t 508
15.3%
r 274
 
8.2%
f 214
 
6.4%
a 97
 
2.9%
o 81
 
2.4%
s 62
 
1.9%
z 33
 
1.0%
l 29
 
0.9%
Other values (14) 173
 
5.2%
Uppercase Letter
ValueCountFrequency (%)
M 616
92.9%
X 18
 
2.7%
F 9
 
1.4%
S 6
 
0.9%
E 3
 
0.5%
H 3
 
0.5%
K 2
 
0.3%
Y 2
 
0.3%
L 1
 
0.2%
V 1
 
0.2%
Other values (2) 2
 
0.3%
Decimal Number
ValueCountFrequency (%)
0 5678
56.7%
1 1022
 
10.2%
5 685
 
6.8%
2 519
 
5.2%
6 395
 
3.9%
7 387
 
3.9%
8 384
 
3.8%
4 355
 
3.5%
3 345
 
3.4%
9 236
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 814
56.8%
' 548
38.3%
, 66
 
4.6%
: 3
 
0.2%
/ 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
± 196
97.0%
+ 6
 
3.0%
Space Separator
ValueCountFrequency (%)
2483
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 194
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14327
78.2%
Latin 3992
 
21.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 1262
31.6%
M 616
15.4%
e 596
14.9%
t 508
12.7%
r 274
 
6.9%
f 214
 
5.4%
a 97
 
2.4%
o 81
 
2.0%
s 62
 
1.6%
z 33
 
0.8%
Other values (26) 249
 
6.2%
Common
ValueCountFrequency (%)
0 5678
39.6%
2483
17.3%
1 1022
 
7.1%
. 814
 
5.7%
5 685
 
4.8%
' 548
 
3.8%
2 519
 
3.6%
6 395
 
2.8%
7 387
 
2.7%
8 384
 
2.7%
Other values (11) 1412
 
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18121
98.9%
None 198
 
1.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 5678
31.3%
2483
13.7%
m 1262
 
7.0%
1 1022
 
5.6%
. 814
 
4.5%
5 685
 
3.8%
M 616
 
3.4%
e 596
 
3.3%
' 548
 
3.0%
2 519
 
2.9%
Other values (45) 3898
21.5%
None
ValueCountFrequency (%)
± 196
99.0%
ü 2
 
1.0%

locationAccordingTo
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 289627
Missing (%): > 99.9%
Memory size: 2.2 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 45.0083

Value     Count   Frequency (%)
45.0083   1       100.0%

Most occurring characters

ValueCountFrequency (%)
0 2
28.6%
4 1
14.3%
5 1
14.3%
. 1
14.3%
8 1
14.3%
3 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
85.7%
Other Punctuation 1
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
33.3%
4 1
16.7%
5 1
16.7%
8 1
16.7%
3 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2
28.6%
4 1
14.3%
5 1
14.3%
. 1
14.3%
8 1
14.3%
3 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2
28.6%
4 1
14.3%
5 1
14.3%
. 1
14.3%
8 1
14.3%
3 1
14.3%

locationRemarks
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 289627
Missing (%): > 99.9%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 128.0083

Value      Count   Frequency (%)
128.0083   1       100.0%

Most occurring characters

ValueCountFrequency (%)
8 2
25.0%
0 2
25.0%
1 1
12.5%
2 1
12.5%
. 1
12.5%
3 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
87.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 2
28.6%
0 2
28.6%
1 1
14.3%
2 1
14.3%
3 1
14.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 2
25.0%
0 2
25.0%
1 1
12.5%
2 1
12.5%
. 1
12.5%
3 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 2
25.0%
0 2
25.0%
1 1
12.5%
2 1
12.5%
. 1
12.5%
3 1
12.5%

decimalLatitude
Text

Missing 

Distinct: 8258
Distinct (%): 5.4%
Missing: 136554
Missing (%): 47.1%
Memory size: 2.2 MiB

Length

Max length: 17
Median length: 11
Mean length: 6.164364948
Min length: 3

Characters and Unicode

Total characters: 943604
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2599
Unique (%): 1.7%

Sample

1st row: 52.25
2nd row: -35.8417
3rd row: 13.5
4th row: -45.15267
5th row: -13.4

Value                 Count    Frequency (%)
6.7667                1821     1.2%
52.2417               1243     0.8%
6.5833                1111     0.7%
6.775                 1102     0.7%
52.175                936      0.6%
5.9417                858      0.6%
52.1                  846      0.6%
3.5917                832      0.5%
53.3917               829      0.5%
52.3583               813      0.5%
Other values (7317)   142683   93.2%

Most occurring characters

ValueCountFrequency (%)
. 153073
16.2%
5 138524
14.7%
3 108304
11.5%
1 88303
9.4%
2 84464
9.0%
7 76372
8.1%
8 60368
 
6.4%
6 56510
 
6.0%
0 52273
 
5.5%
4 49409
 
5.2%
Other values (5) 76004
8.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 745833
79.0%
Other Punctuation 153073
 
16.2%
Dash Punctuation 44695
 
4.7%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 138524
18.6%
3 108304
14.5%
1 88303
11.8%
2 84464
11.3%
7 76372
10.2%
8 60368
8.1%
6 56510
7.6%
0 52273
 
7.0%
4 49409
 
6.6%
9 31306
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
W 1
33.3%
G 1
33.3%
S 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 153073
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 44695
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 943601
> 99.9%
Latin 3
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
. 153073
16.2%
5 138524
14.7%
3 108304
11.5%
1 88303
9.4%
2 84464
9.0%
7 76372
8.1%
8 60368
 
6.4%
6 56510
 
6.0%
0 52273
 
5.5%
4 49409
 
5.2%
Other values (2) 76001
8.1%
Latin
ValueCountFrequency (%)
W 1
33.3%
G 1
33.3%
S 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 943604
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 153073
16.2%
5 138524
14.7%
3 108304
11.5%
1 88303
9.4%
2 84464
9.0%
7 76372
8.1%
8 60368
 
6.4%
6 56510
 
6.0%
0 52273
 
5.5%
4 49409
 
5.2%
Other values (5) 76004
8.1%
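
decimalLatitude is profiled as Text, and its category table lists three uppercase letters (W, G, S), so at least one cell is apparently not a plain decimal number. A minimal sketch for flagging such cells before converting the column to a numeric type follows. `invalid_latitudes` is a hypothetical helper, and the `"WGS84"` and `"128.0083"` entries in the demo list are illustrative guesses at what a stray value might look like, not cells read from the dataset at this position.

```python
def invalid_latitudes(values):
    """Return (index, value) pairs that are not decimal degrees within [-90, 90]."""
    bad = []
    for i, raw in enumerate(values):
        if raw is None:  # missing cells are not flagged here
            continue
        try:
            lat = float(raw)
        except ValueError:
            bad.append((i, raw))  # not parseable as a number
            continue
        if not -90.0 <= lat <= 90.0:
            bad.append((i, raw))  # numeric but outside the valid latitude range
    return bad

# The first three values are sample rows from this report; the last two are made-up test cases.
demo = ["52.25", "-35.8417", "13.5", "WGS84", "128.0083"]
print(invalid_latitudes(demo))
```

Flagging rather than dropping keeps the row index available, which helps when checking whether neighbouring columns of the same record are shifted as well.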

decimalLongitude
Text

Missing 

Distinct: 10150
Distinct (%): 6.6%
Missing: 135979
Missing (%): 46.9%
Memory size: 2.2 MiB

Length

Max length: 18
Median length: 17
Mean length: 6.284388444
Min length: 3

Characters and Unicode

Total characters: 965590
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3552
Unique (%): 2.3%

Sample

1st row: 4.5333
2nd row: 137.5083
3rd row: -16.0
4th row: 169.89263
5th row: 48.27

Value                 Count    Frequency (%)
106.9167              1795     1.2%
107.0                 1161     0.8%
106.925               1127     0.7%
106.8                 1065     0.7%
4.875                 975      0.6%
124.8583              748      0.5%
4.425                 748      0.5%
98.675                716      0.5%
106.825               699      0.5%
6.1                   699      0.5%
Other values (9278)   143916   93.7%

Most occurring characters

ValueCountFrequency (%)
. 153649
15.9%
1 124246
12.9%
5 103162
10.7%
3 91574
9.5%
7 85735
8.9%
4 75127
7.8%
0 74503
7.7%
8 73203
7.6%
6 64305
6.7%
2 52724
 
5.5%
Other values (2) 67362
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 790611
81.9%
Other Punctuation 153649
 
15.9%
Dash Punctuation 21330
 
2.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 124246
15.7%
5 103162
13.0%
3 91574
11.6%
7 85735
10.8%
4 75127
9.5%
0 74503
9.4%
8 73203
9.3%
6 64305
8.1%
2 52724
6.7%
9 46032
 
5.8%
Other Punctuation
ValueCountFrequency (%)
. 153649
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21330
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 965590
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 153649
15.9%
1 124246
12.9%
5 103162
10.7%
3 91574
9.5%
7 85735
8.9%
4 75127
7.8%
0 74503
7.7%
8 73203
7.6%
6 64305
6.7%
2 52724
 
5.5%
Other values (2) 67362
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 965590
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 153649
15.9%
1 124246
12.9%
5 103162
10.7%
3 91574
9.5%
7 85735
8.9%
4 75127
7.8%
0 74503
7.7%
8 73203
7.6%
6 64305
6.7%
2 52724
 
5.5%
Other values (2) 67362
7.0%

geodeticDatum
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 2.2 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 1448135
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WGS84
2nd row: WGS84
3rd row: WGS84
4th row: WGS84
5th row: WGS84

Value   Count    Frequency (%)
wgs84   289627   100.0%

Most occurring characters

ValueCountFrequency (%)
W 289627
20.0%
G 289627
20.0%
S 289627
20.0%
8 289627
20.0%
4 289627
20.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 868881
60.0%
Decimal Number 579254
40.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
W 289627
33.3%
G 289627
33.3%
S 289627
33.3%
Decimal Number
ValueCountFrequency (%)
8 289627
50.0%
4 289627
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 868881
60.0%
Common 579254
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
W 289627
33.3%
G 289627
33.3%
S 289627
33.3%
Common
ValueCountFrequency (%)
8 289627
50.0%
4 289627
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1448135
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
W 289627
20.0%
G 289627
20.0%
S 289627
20.0%
8 289627
20.0%
4 289627
20.0%
Distinct: 172
Distinct (%): 10.4%
Missing: 287974
Missing (%): 99.4%
Memory size: 2.2 MiB

Length

Max length: 7
Median length: 5
Mean length: 3.42140266
Min length: 1

Characters and Unicode

Total characters: 5659
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 72
Unique (%): 4.4%

Sample

1st row: 640000
2nd row: 20000
3rd row: 640000
4th row: 1000
5th row: 1

Value                Count   Frequency (%)
5                    399     24.1%
82230                128     7.7%
60697                87      5.3%
100                  71      4.3%
216478               65      3.9%
1000                 48      2.9%
2000                 47      2.8%
200                  41      2.5%
5196                 40      2.4%
50                   37      2.2%
Other values (162)   691     41.8%

Most occurring characters

ValueCountFrequency (%)
0 1808
31.9%
5 693
 
12.2%
2 579
 
10.2%
6 555
 
9.8%
1 436
 
7.7%
7 384
 
6.8%
4 329
 
5.8%
8 312
 
5.5%
3 300
 
5.3%
9 263
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5659
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1808
31.9%
5 693
 
12.2%
2 579
 
10.2%
6 555
 
9.8%
1 436
 
7.7%
7 384
 
6.8%
4 329
 
5.8%
8 312
 
5.5%
3 300
 
5.3%
9 263
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Common 5659
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1808
31.9%
5 693
 
12.2%
2 579
 
10.2%
6 555
 
9.8%
1 436
 
7.7%
7 384
 
6.8%
4 329
 
5.8%
8 312
 
5.5%
3 300
 
5.3%
9 263
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5659
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1808
31.9%
5 693
 
12.2%
2 579
 
10.2%
6 555
 
9.8%
1 436
 
7.7%
7 384
 
6.8%
4 329
 
5.8%
8 312
 
5.5%
3 300
 
5.3%
9 263
 
4.6%

typeStatus
Text

Missing 

Distinct: 6
Distinct (%): 0.2%
Missing: 286162
Missing (%): 98.8%
Memory size: 2.2 MiB

Length

Max length: 13
Median length: 7
Mean length: 7.704847086
Min length: 4

Characters and Unicode

Total characters: 26705
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: syntype
2nd row: syntype
3rd row: syntype
4th row: paratype
5th row: paratype

Value           Count   Frequency (%)
syntype         2273    65.6%
paratype        500     14.4%
holotype        369     10.6%
paralectotype   239     6.9%
lectotype       79      2.3%
type            6       0.2%

Most occurring characters

ValueCountFrequency (%)
y 5739
21.5%
p 4205
15.7%
t 3784
14.2%
e 3784
14.2%
s 2273
 
8.5%
n 2273
 
8.5%
a 1478
 
5.5%
o 1056
 
4.0%
r 739
 
2.8%
l 687
 
2.6%
Other values (2) 687
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26705
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y 5739
21.5%
p 4205
15.7%
t 3784
14.2%
e 3784
14.2%
s 2273
 
8.5%
n 2273
 
8.5%
a 1478
 
5.5%
o 1056
 
4.0%
r 739
 
2.8%
l 687
 
2.6%
Other values (2) 687
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 26705
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
y 5739
21.5%
p 4205
15.7%
t 3784
14.2%
e 3784
14.2%
s 2273
 
8.5%
n 2273
 
8.5%
a 1478
 
5.5%
o 1056
 
4.0%
r 739
 
2.8%
l 687
 
2.6%
Other values (2) 687
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26705
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
y 5739
21.5%
p 4205
15.7%
t 3784
14.2%
e 3784
14.2%
s 2273
 
8.5%
n 2273
 
8.5%
a 1478
 
5.5%
o 1056
 
4.0%
r 739
 
2.8%
l 687
 
2.6%
Other values (2) 687
 
2.6%

identifiedBy
Text

Missing 

Distinct: 48
Distinct (%): 11.7%
Missing: 289216
Missing (%): 99.9%
Memory size: 2.2 MiB

Length

Max length: 25
Median length: 9
Mean length: 9.708737864
Min length: 4

Characters and Unicode

Total characters: 4000
Distinct characters: 58
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 5.1%

Sample

1st row: Rijswijk C. van
2nd row: Konter A.
3rd row: Konter A.
4th row: Voous of Wattel?
5th row: Voous

Value               Count   Frequency (%)
konter              165     20.0%
a                   165     20.0%
dekker              113     13.7%
r                   113     13.7%
voous               32      3.9%
roselaar            21      2.5%
jansen              11      1.3%
j.f.j               11      1.3%
k                   11      1.3%
of                  9       1.1%
Other values (72)   173     21.0%

Most occurring characters

ValueCountFrequency (%)
e 527
13.2%
412
10.3%
. 408
10.2%
r 342
 
8.6%
o 283
 
7.1%
k 242
 
6.0%
n 218
 
5.5%
t 206
 
5.1%
K 184
 
4.6%
A 166
 
4.2%
Other values (48) 1012
25.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2283
57.1%
Uppercase Letter 833
 
20.8%
Other Punctuation 418
 
10.4%
Space Separator 412
 
10.3%
Decimal Number 48
 
1.2%
Open Punctuation 3
 
0.1%
Close Punctuation 3
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 527
23.1%
r 342
15.0%
o 283
12.4%
k 242
10.6%
n 218
9.5%
t 206
 
9.0%
a 108
 
4.7%
s 95
 
4.2%
l 60
 
2.6%
u 41
 
1.8%
Other values (13) 161
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
K 184
22.1%
A 166
19.9%
R 137
16.4%
D 121
14.5%
V 43
 
5.2%
J 36
 
4.3%
P 22
 
2.6%
S 19
 
2.3%
W 16
 
1.9%
C 15
 
1.8%
Other values (11) 74
8.9%
Decimal Number
ValueCountFrequency (%)
0 13
27.1%
1 11
22.9%
2 10
20.8%
3 8
16.7%
5 2
 
4.2%
9 2
 
4.2%
8 1
 
2.1%
4 1
 
2.1%
Other Punctuation
ValueCountFrequency (%)
. 408
97.6%
? 9
 
2.2%
& 1
 
0.2%
Space Separator
ValueCountFrequency (%)
412
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3116
77.9%
Common 884
 
22.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 527
16.9%
r 342
11.0%
o 283
9.1%
k 242
 
7.8%
n 218
 
7.0%
t 206
 
6.6%
K 184
 
5.9%
A 166
 
5.3%
R 137
 
4.4%
D 121
 
3.9%
Other values (34) 690
22.1%
Common
ValueCountFrequency (%)
412
46.6%
. 408
46.2%
0 13
 
1.5%
1 11
 
1.2%
2 10
 
1.1%
? 9
 
1.0%
3 8
 
0.9%
( 3
 
0.3%
) 3
 
0.3%
5 2
 
0.2%
Other values (4) 5
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 527
13.2%
412
10.3%
. 408
10.2%
r 342
 
8.6%
o 283
 
7.1%
k 242
 
6.0%
n 218
 
5.5%
t 206
 
5.1%
K 184
 
4.6%
A 166
 
4.2%
Other values (48) 1012
25.3%

dateIdentified
Text

Missing 

Distinct: 40
Distinct (%): 15.6%
Missing: 289371
Missing (%): 99.9%
Memory size: 2.2 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2570
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 9.7%

Sample

1st row: 2022/07/01
2nd row: 2022/04/25
3rd row: 2022/04/25
4th row: 1964/01/01
5th row: 2022/04/25

Value               Count   Frequency (%)
2022/04/25          165     64.2%
2018/05/31          13      5.1%
2021/07/01          11      4.3%
1964/01/01          10      3.9%
2014/10/28          7       2.7%
2014/10/20          4       1.6%
2023/12/28          3       1.2%
2022/08/31          3       1.2%
2017/04/17          3       1.2%
2023/01/01          3       1.2%
Other values (30)   35      13.6%

Most occurring characters

ValueCountFrequency (%)
2 814
31.7%
0 550
21.4%
/ 514
20.0%
4 195
 
7.6%
5 184
 
7.2%
1 175
 
6.8%
8 38
 
1.5%
3 33
 
1.3%
7 28
 
1.1%
9 23
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2056
80.0%
Other Punctuation 514
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 814
39.6%
0 550
26.8%
4 195
 
9.5%
5 184
 
8.9%
1 175
 
8.5%
8 38
 
1.8%
3 33
 
1.6%
7 28
 
1.4%
9 23
 
1.1%
6 16
 
0.8%
Other Punctuation
ValueCountFrequency (%)
/ 514
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2570
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 814
31.7%
0 550
21.4%
/ 514
20.0%
4 195
 
7.6%
5 184
 
7.2%
1 175
 
6.8%
8 38
 
1.5%
3 33
 
1.3%
7 28
 
1.1%
9 23
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2570
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 814
31.7%
0 550
21.4%
/ 514
20.0%
4 195
 
7.6%
5 184
 
7.2%
1 175
 
6.8%
8 38
 
1.5%
3 33
 
1.3%
7 28
 
1.1%
9 23
 
0.9%

namePublishedInID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 289627
Missing (%): > 99.9%
Memory size: 2.2 MiB

Length

Max length: 33
Median length: 33
Mean length: 33
Min length: 33

Characters and Unicode

Total characters: 33
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Crossoptilon mantchuricum Swinhoe

Value          Count   Frequency (%)
crossoptilon   1       33.3%
mantchuricum   1       33.3%
swinhoe        1       33.3%

Most occurring characters

ValueCountFrequency (%)
o 4
12.1%
i 3
 
9.1%
n 3
 
9.1%
2
 
6.1%
h 2
 
6.1%
s 2
 
6.1%
t 2
 
6.1%
r 2
 
6.1%
m 2
 
6.1%
u 2
 
6.1%
Other values (8) 9
27.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29
87.9%
Space Separator 2
 
6.1%
Uppercase Letter 2
 
6.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 4
13.8%
i 3
10.3%
n 3
10.3%
h 2
 
6.9%
s 2
 
6.9%
t 2
 
6.9%
r 2
 
6.9%
m 2
 
6.9%
u 2
 
6.9%
c 2
 
6.9%
Other values (5) 5
17.2%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
C 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31
93.9%
Common 2
 
6.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 4
12.9%
i 3
9.7%
n 3
9.7%
h 2
 
6.5%
s 2
 
6.5%
t 2
 
6.5%
r 2
 
6.5%
m 2
 
6.5%
u 2
 
6.5%
c 2
 
6.5%
Other values (7) 7
22.6%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 4
12.1%
i 3
 
9.1%
n 3
 
9.1%
2
 
6.1%
h 2
 
6.1%
s 2
 
6.1%
t 2
 
6.1%
r 2
 
6.1%
m 2
 
6.1%
u 2
 
6.1%
Other values (8) 9
27.3%
Distinct: 27724
Distinct (%): 9.6%
Missing: 1
Missing (%): < 0.1%
Memory size: 2.2 MiB

Length

Max length: 122
Median length: 73
Mean length: 38.16476019
Min length: 3

Characters and Unicode

Total characters: 11053545
Distinct characters: 99
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8762
Unique (%): 3.0%

Sample

1st row: Vidua orientalis cf Heuglin, 1871
2nd row: Turdus viscivorus viscivorus Linnaeus, 1758
3rd row: Neophema splendida Gould, 1841
4th row: Platycercus elegans melanopterus North, 1906
5th row: Polytelis anthopeplus monarchoides
ValueCountFrequency (%)
linnaeus 87214
 
6.6%
1758 62801
 
4.8%
temminck 13007
 
1.0%
vieillot 10905
 
0.8%
10567
 
0.8%
gmelin 9441
 
0.7%
horsfield 8367
 
0.6%
1766 7967
 
0.6%
1821 5912
 
0.5%
1789 5905
 
0.4%
Other values (11804) 1091525
83.1%

Most occurring characters

ValueCountFrequency (%)
1024031
 
9.3%
a 952767
 
8.6%
i 790587
 
7.2%
s 746831
 
6.8%
e 660252
 
6.0%
n 635153
 
5.7%
r 588460
 
5.3%
u 586053
 
5.3%
l 504888
 
4.6%
o 487902
 
4.4%
Other values (89) 4076621
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8034631
72.7%
Space Separator 1024031
 
9.3%
Decimal Number 870121
 
7.9%
Uppercase Letter 603704
 
5.5%
Other Punctuation 281559
 
2.5%
Open Punctuation 119186
 
1.1%
Close Punctuation 119128
 
1.1%
Dash Punctuation 902
 
< 0.1%
Math Symbol 282
 
< 0.1%
Modifier Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 952767
11.9%
i 790587
9.8%
s 746831
9.3%
e 660252
 
8.2%
n 635153
 
7.9%
r 588460
 
7.3%
u 586053
 
7.3%
l 504888
 
6.3%
o 487902
 
6.1%
t 387033
 
4.8%
Other values (30) 1694705
21.1%
Uppercase Letter
ValueCountFrequency (%)
L 122957
20.4%
P 57005
9.4%
S 50131
8.3%
C 50030
8.3%
T 36709
 
6.1%
A 36433
 
6.0%
G 34956
 
5.8%
B 31313
 
5.2%
M 30686
 
5.1%
H 30416
 
5.0%
Other values (16) 123068
20.4%
Other Punctuation
ValueCountFrequency (%)
, 228648
81.2%
. 41966
 
14.9%
& 9811
 
3.5%
' 556
 
0.2%
? 301
 
0.1%
" 142
 
0.1%
/ 69
 
< 0.1%
: 43
 
< 0.1%
\ 16
 
< 0.1%
! 4
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 254933
29.3%
8 193308
22.2%
7 127177
14.6%
5 82983
 
9.5%
9 43973
 
5.1%
6 42917
 
4.9%
2 39382
 
4.5%
3 33906
 
3.9%
4 26060
 
3.0%
0 25482
 
2.9%
Math Symbol
ValueCountFrequency (%)
< 140
49.6%
> 131
46.5%
= 9
 
3.2%
2
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 119057
99.9%
] 42
 
< 0.1%
} 29
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 119115
99.9%
[ 71
 
0.1%
Space Separator
ValueCountFrequency (%)
1024031
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 902
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8638325
78.1%
Common 2415210
 
21.9%
Greek 10
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 952767
11.0%
i 790587
 
9.2%
s 746831
 
8.6%
e 660252
 
7.6%
n 635153
 
7.4%
r 588460
 
6.8%
u 586053
 
6.8%
l 504888
 
5.8%
o 487902
 
5.6%
t 387033
 
4.5%
Other values (55) 2298399
26.6%
Common
ValueCountFrequency (%)
1024031
42.4%
1 254933
 
10.6%
, 228648
 
9.5%
8 193308
 
8.0%
7 127177
 
5.3%
( 119115
 
4.9%
) 119057
 
4.9%
5 82983
 
3.4%
9 43973
 
1.8%
6 42917
 
1.8%
Other values (23) 179068
 
7.4%
Greek
ValueCountFrequency (%)
δ 10
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11044356
99.9%
None 9187
 
0.1%
Math Operators 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1024031
 
9.3%
a 952767
 
8.6%
i 790587
 
7.2%
s 746831
 
6.8%
e 660252
 
6.0%
n 635153
 
5.8%
r 588460
 
5.3%
u 586053
 
5.3%
l 504888
 
4.6%
o 487902
 
4.4%
Other values (74) 4067432
36.8%
None
ValueCountFrequency (%)
ü 7400
80.5%
é 471
 
5.1%
ø 465
 
5.1%
ä 379
 
4.1%
á 245
 
2.7%
ö 58
 
0.6%
ï 55
 
0.6%
ë 51
 
0.6%
è 46
 
0.5%
δ 10
 
0.1%
Other values (4) 7
 
0.1%
Math Operators
ValueCountFrequency (%)
2
100.0%

namePublishedIn
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 289627
Missing (%): > 99.9%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Animalia

Value      Count   Frequency (%)
animalia   1       100.0%

Most occurring characters

ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
a 2
28.6%
n 1
14.3%
m 1
14.3%
l 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

namePublishedInYear
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing289627
Missing (%)> 99.9%
Memory size2.2 MiB
2025-01-14T10:45:32.477907image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters8
Distinct characters6
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowAnimalia
ValueCountFrequency (%)
animalia 1
100.0%

Most occurring characters

ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
a 2
28.6%
n 1
14.3%
m 1
14.3%
l 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%
Distinct310
Distinct (%)0.1%
Missing1
Missing (%)< 0.1%
Memory size2.2 MiB

Length

Max length45
Median length43
Mean length16.59742704
Min length8

Characters and Unicode

Total characters4807063
Distinct characters52
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique22
Unique (%)< 0.1%

Sample

1st rowAnimalia|Viduidae
2nd rowAnimalia|Turdidae
3rd rowAnimalia|Psittacidae
4th rowAnimalia|Psittacidae
5th rowAnimalia|Psittacidae
ValueCountFrequency (%)
animalia 73469
25.3%
animalia|turdidae 13154
 
4.5%
animalia|scolopacidae 10694
 
3.7%
animalia|sylviidae 10286
 
3.5%
animalia|emberizidae 8024
 
2.8%
animalia|fringillidae 7443
 
2.6%
animalia|corvidae 7140
 
2.5%
animalia|ardeidae 5218
 
1.8%
animalia|timaliidae 5010
 
1.7%
animalia|charadriidae 4758
 
1.6%
Other values (298) 145140
50.0%

Most occurring characters

ValueCountFrequency (%)
i 933491
19.4%
a 914161
19.0%
l 392122
8.2%
n 356546
 
7.4%
m 317726
 
6.6%
A 315297
 
6.6%
e 275552
 
5.7%
d 260453
 
5.4%
| 220567
 
4.6%
r 137646
 
2.9%
Other values (42) 683502
14.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4073456
84.7%
Uppercase Letter 511598
 
10.6%
Math Symbol 220567
 
4.6%
Other Punctuation 733
 
< 0.1%
Space Separator 709
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 933491
22.9%
a 914161
22.4%
l 392122
9.6%
n 356546
 
8.8%
m 317726
 
7.8%
e 275552
 
6.8%
d 260453
 
6.4%
r 137646
 
3.4%
c 97981
 
2.4%
o 93015
 
2.3%
Other values (13) 294763
 
7.2%
Uppercase Letter
ValueCountFrequency (%)
A 315297
61.6%
P 35105
 
6.9%
T 32092
 
6.3%
S 30744
 
6.0%
C 24779
 
4.8%
M 14619
 
2.9%
E 13060
 
2.6%
F 11340
 
2.2%
L 6846
 
1.3%
N 4455
 
0.9%
Other values (12) 23261
 
4.5%
Other Punctuation
ValueCountFrequency (%)
: 679
92.6%
? 39
 
5.3%
/ 12
 
1.6%
, 2
 
0.3%
. 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
| 220567
100.0%
Space Separator
ValueCountFrequency (%)
709
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4585054
95.4%
Common 222009
 
4.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 933491
20.4%
a 914161
19.9%
l 392122
8.6%
n 356546
 
7.8%
m 317726
 
6.9%
A 315297
 
6.9%
e 275552
 
6.0%
d 260453
 
5.7%
r 137646
 
3.0%
c 97981
 
2.1%
Other values (35) 584079
12.7%
Common
ValueCountFrequency (%)
| 220567
99.4%
709
 
0.3%
: 679
 
0.3%
? 39
 
< 0.1%
/ 12
 
< 0.1%
, 2
 
< 0.1%
. 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4807063
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 933491
19.4%
a 914161
19.0%
l 392122
8.2%
n 356546
 
7.4%
m 317726
 
6.6%
A 315297
 
6.6%
e 275552
 
5.7%
d 260453
 
5.4%
| 220567
 
4.6%
r 137646
 
2.9%
Other values (42) 683502
14.2%

kingdom
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing1
Missing (%)< 0.1%
Memory size2.2 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters2317016
Distinct characters6
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAnimalia
2nd rowAnimalia
3rd rowAnimalia
4th rowAnimalia
5th rowAnimalia
ValueCountFrequency (%)
animalia 289627
100.0%

Most occurring characters

ValueCountFrequency (%)
i 579254
25.0%
a 579254
25.0%
A 289627
12.5%
n 289627
12.5%
m 289627
12.5%
l 289627
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2027389
87.5%
Uppercase Letter 289627
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 579254
28.6%
a 579254
28.6%
n 289627
14.3%
m 289627
14.3%
l 289627
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 289627
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2317016
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 579254
25.0%
a 579254
25.0%
A 289627
12.5%
n 289627
12.5%
m 289627
12.5%
l 289627
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2317016
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 579254
25.0%
a 579254
25.0%
A 289627
12.5%
n 289627
12.5%
m 289627
12.5%
l 289627
12.5%

class
Text

Missing 

Distinct2
Distinct (%)0.1%
Missing286898
Missing (%)99.1%
Memory size2.2 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters10920
Distinct characters7
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAves
2nd rowAves
3rd rowAves
4th rowAves
5th rowAves
ValueCountFrequency (%)
aves 2730
100.0%

Most occurring characters

ValueCountFrequency (%)
A 2730
25.0%
v 2516
23.0%
e 2516
23.0%
s 2516
23.0%
V 214
 
2.0%
E 214
 
2.0%
S 214
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7548
69.1%
Uppercase Letter 3372
30.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2730
81.0%
V 214
 
6.3%
E 214
 
6.3%
S 214
 
6.3%
Lowercase Letter
ValueCountFrequency (%)
v 2516
33.3%
e 2516
33.3%
s 2516
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 10920
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2730
25.0%
v 2516
23.0%
e 2516
23.0%
s 2516
23.0%
V 214
 
2.0%
E 214
 
2.0%
S 214
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10920
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 2730
25.0%
v 2516
23.0%
e 2516
23.0%
s 2516
23.0%
V 214
 
2.0%
E 214
 
2.0%
S 214
 
2.0%

order
Text

Missing 

Distinct4
Distinct (%)0.2%
Missing287366
Missing (%)99.2%
Memory size2.2 MiB
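
The summary figures above appear consistent with computing the distinct percentage over non-missing values and the missing percentage over all rows: 4 distinct values among 289628 - 287366 = 2262 non-missing rows is about 0.2%, and 287366 of 289628 rows is about 99.2%. A minimal sketch under that assumption, using a hypothetical helper `summarize` and made-up sample values rather than the actual dataset:

```python
def summarize(values):
    """Return distinct/missing counts and percentages for a column.

    Assumption: distinct % is taken over non-missing values and
    missing % over all rows, which matches the figures shown here.
    None marks a missing cell.
    """
    total = len(values)
    present = [v for v in values if v is not None]
    missing = total - len(present)
    distinct = len(set(present))
    return {
        "distinct": distinct,
        "distinct_pct": 100 * distinct / len(present) if present else 0.0,
        "missing": missing,
        "missing_pct": 100 * missing / total if total else 0.0,
    }


# Made-up sample: 4 non-missing cells with 2 distinct values, 4 missing cells.
sample = ["Passeriformes"] * 3 + ["AVES"] + [None] * 4
print(summarize(sample))
```

Note that distinct counting is case-sensitive in this sketch, so "Aves" and "AVES" would count as two values.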

Length

Max length13
Median length13
Mean length12.94429708
Min length4

Characters and Unicode

Total characters29280
Distinct characters15
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)0.1%

Sample

1st rowPasseriformes
2nd rowPasseriformes
3rd rowPasseriformes
4th rowPasseriformes
5th rowPasseriformes
ValueCountFrequency (%)
passeriformes 2248
99.4%
aves 14
 
0.6%

Most occurring characters

ValueCountFrequency (%)
s 6745
23.0%
e 4497
15.4%
r 4496
15.4%
a 2248
 
7.7%
i 2248
 
7.7%
f 2248
 
7.7%
o 2248
 
7.7%
m 2248
 
7.7%
P 2247
 
7.7%
A 14
 
< 0.1%
Other values (5) 41
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26980
92.1%
Uppercase Letter 2300
 
7.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 6745
25.0%
e 4497
16.7%
r 4496
16.7%
a 2248
 
8.3%
i 2248
 
8.3%
f 2248
 
8.3%
o 2248
 
8.3%
m 2248
 
8.3%
v 1
 
< 0.1%
p 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
P 2247
97.7%
A 14
 
0.6%
V 13
 
0.6%
E 13
 
0.6%
S 13
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 29280
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 6745
23.0%
e 4497
15.4%
r 4496
15.4%
a 2248
 
7.7%
i 2248
 
7.7%
f 2248
 
7.7%
o 2248
 
7.7%
m 2248
 
7.7%
P 2247
 
7.7%
A 14
 
< 0.1%
Other values (5) 41
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29280
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 6745
23.0%
e 4497
15.4%
r 4496
15.4%
a 2248
 
7.7%
i 2248
 
7.7%
f 2248
 
7.7%
o 2248
 
7.7%
m 2248
 
7.7%
P 2247
 
7.7%
A 14
 
< 0.1%
Other values (5) 41
 
0.1%

family
Text

Missing 

Distinct247
Distinct (%)0.1%
Missing74054
Missing (%)25.6%
Memory size2.2 MiB

Length

Max length27
Median length22
Mean length10.34113576
Min length6

Characters and Unicode

Total characters2229280
Distinct characters51
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18
Unique (%)< 0.1%

Sample

1st rowViduidae
2nd rowTurdidae
3rd rowPsittacidae
4th rowPsittacidae
5th rowPsittacidae
ValueCountFrequency (%)
turdidae 13278
 
6.1%
scolopacidae 10694
 
4.9%
sylviidae 10420
 
4.8%
emberizidae 8091
 
3.7%
fringillidae 7502
 
3.5%
corvidae 7196
 
3.3%
ardeidae 5218
 
2.4%
timaliidae 5165
 
2.4%
sturnidae 4769
 
2.2%
pycnonotidae 4762
 
2.2%
Other values (238) 139188
64.4%

Most occurring characters

ValueCountFrequency (%)
i 351989
15.8%
a 332659
14.9%
e 268539
12.0%
d 260453
11.7%
r 133150
 
6.0%
l 102495
 
4.6%
c 97981
 
4.4%
o 90767
 
4.1%
n 66919
 
3.0%
t 56941
 
2.6%
Other values (41) 467387
21.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2011539
90.2%
Uppercase Letter 216299
 
9.7%
Other Punctuation 733
 
< 0.1%
Space Separator 709
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 351989
17.5%
a 332659
16.5%
e 268539
13.3%
d 260453
12.9%
r 133150
 
6.6%
l 102495
 
5.1%
c 97981
 
4.9%
o 90767
 
4.5%
n 66919
 
3.3%
t 56941
 
2.8%
Other values (13) 249646
12.4%
Uppercase Letter
ValueCountFrequency (%)
P 32858
15.2%
T 32092
14.8%
S 30517
14.1%
C 24779
11.5%
A 22926
10.6%
M 14619
6.8%
E 12833
 
5.9%
F 11340
 
5.2%
L 6846
 
3.2%
N 4455
 
2.1%
Other values (12) 23034
10.6%
Other Punctuation
ValueCountFrequency (%)
: 679
92.6%
? 39
 
5.3%
/ 12
 
1.6%
, 2
 
0.3%
. 1
 
0.1%
Space Separator
ValueCountFrequency (%)
709
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2227838
99.9%
Common 1442
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 351989
15.8%
a 332659
14.9%
e 268539
12.1%
d 260453
11.7%
r 133150
 
6.0%
l 102495
 
4.6%
c 97981
 
4.4%
o 90767
 
4.1%
n 66919
 
3.0%
t 56941
 
2.6%
Other values (35) 465945
20.9%
Common
ValueCountFrequency (%)
709
49.2%
: 679
47.1%
? 39
 
2.7%
/ 12
 
0.8%
, 2
 
0.1%
. 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2229280
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 351989
15.8%
a 332659
14.9%
e 268539
12.0%
d 260453
11.7%
r 133150
 
6.0%
l 102495
 
4.6%
c 97981
 
4.4%
o 90767
 
4.1%
n 66919
 
3.0%
t 56941
 
2.6%
Other values (41) 467387
21.0%

tribe
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing289627
Missing (%)> 99.9%
Memory size2.2 MiB

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters12
Distinct characters9
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowCrossoptilon
ValueCountFrequency (%)
crossoptilon 1
100.0%

Most occurring characters

ValueCountFrequency (%)
o 3
25.0%
s 2
16.7%
C 1
 
8.3%
r 1
 
8.3%
p 1
 
8.3%
t 1
 
8.3%
i 1
 
8.3%
l 1
 
8.3%
n 1
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11
91.7%
Uppercase Letter 1
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3
27.3%
s 2
18.2%
r 1
 
9.1%
p 1
 
9.1%
t 1
 
9.1%
i 1
 
9.1%
l 1
 
9.1%
n 1
 
9.1%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3
25.0%
s 2
16.7%
C 1
 
8.3%
r 1
 
8.3%
p 1
 
8.3%
t 1
 
8.3%
i 1
 
8.3%
l 1
 
8.3%
n 1
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 3
25.0%
s 2
16.7%
C 1
 
8.3%
r 1
 
8.3%
p 1
 
8.3%
t 1
 
8.3%
i 1
 
8.3%
l 1
 
8.3%
n 1
 
8.3%

genus
Text

Distinct2534
Distinct (%)0.9%
Missing580
Missing (%)0.2%
Memory size2.2 MiB

Length

Max length34
Median length26
Mean length8.144879051
Min length1

Characters and Unicode

Total characters2354261
Distinct characters68
Distinct categories6
Distinct scripts3
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique306
Unique (%)0.1%

Sample

1st rowVidua
2nd rowTurdus
3rd rowNeophema
4th rowPlatycercus
5th rowPolytelis
ValueCountFrequency (%)
turdus 5647
 
2.0%
larus 4361
 
1.5%
falco 3593
 
1.2%
parus 3588
 
1.2%
corvus 3377
 
1.2%
pycnonotus 3358
 
1.2%
sterna 3246
 
1.1%
passer 3110
 
1.1%
anas 2998
 
1.0%
accipiter 2973
 
1.0%
Other values (2474) 252913
87.5%

Most occurring characters

ValueCountFrequency (%)
a 250286
 
10.6%
r 185196
 
7.9%
s 184735
 
7.8%
i 178828
 
7.6%
o 171056
 
7.3%
u 166327
 
7.1%
e 131882
 
5.6%
l 130804
 
5.6%
c 112924
 
4.8%
t 106201
 
4.5%
Other values (58) 736022
31.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2072595
88.0%
Uppercase Letter 281276
 
11.9%
Other Punctuation 178
 
< 0.1%
Space Separator 116
 
< 0.1%
Open Punctuation 48
 
< 0.1%
Close Punctuation 48
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 250286
12.1%
r 185196
8.9%
s 184735
8.9%
i 178828
 
8.6%
o 171056
 
8.3%
u 166327
 
8.0%
e 131882
 
6.4%
l 130804
 
6.3%
c 112924
 
5.4%
t 106201
 
5.1%
Other values (20) 454356
21.9%
Uppercase Letter
ValueCountFrequency (%)
P 44553
15.8%
C 41595
14.8%
A 32992
11.7%
T 20867
 
7.4%
S 19949
 
7.1%
M 18719
 
6.7%
L 17004
 
6.0%
E 12050
 
4.3%
D 10139
 
3.6%
G 9462
 
3.4%
Other values (16) 53946
19.2%
Other Punctuation
ValueCountFrequency (%)
. 134
75.3%
? 21
 
11.8%
' 14
 
7.9%
/ 3
 
1.7%
; 3
 
1.7%
" 2
 
1.1%
, 1
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 45
93.8%
[ 3
 
6.2%
Close Punctuation
ValueCountFrequency (%)
) 45
93.8%
] 3
 
6.2%
Space Separator
ValueCountFrequency (%)
116
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2353861
> 99.9%
Common 390
 
< 0.1%
Greek 10
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 250286
 
10.6%
r 185196
 
7.9%
s 184735
 
7.8%
i 178828
 
7.6%
o 171056
 
7.3%
u 166327
 
7.1%
e 131882
 
5.6%
l 130804
 
5.6%
c 112924
 
4.8%
t 106201
 
4.5%
Other values (45) 735622
31.3%
Common
ValueCountFrequency (%)
. 134
34.4%
116
29.7%
( 45
 
11.5%
) 45
 
11.5%
? 21
 
5.4%
' 14
 
3.6%
/ 3
 
0.8%
; 3
 
0.8%
[ 3
 
0.8%
] 3
 
0.8%
Other values (2) 3
 
0.8%
Greek
ValueCountFrequency (%)
δ 10
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2354212
> 99.9%
None 49
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 250286
 
10.6%
r 185196
 
7.9%
s 184735
 
7.8%
i 178828
 
7.6%
o 171056
 
7.3%
u 166327
 
7.1%
e 131882
 
5.6%
l 130804
 
5.6%
c 112924
 
4.8%
t 106201
 
4.5%
Other values (54) 735973
31.3%
None
ValueCountFrequency (%)
ü 26
53.1%
ï 12
24.5%
δ 10
 
20.4%
ß 1
 
2.0%

subgenus
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing289627
Missing (%)> 99.9%
Memory size2.2 MiB

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters12
Distinct characters9
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowmantchuricum
ValueCountFrequency (%)
mantchuricum 1
100.0%

Most occurring characters

ValueCountFrequency (%)
m 2
16.7%
c 2
16.7%
u 2
16.7%
a 1
8.3%
n 1
8.3%
t 1
8.3%
h 1
8.3%
r 1
8.3%
i 1
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 2
16.7%
c 2
16.7%
u 2
16.7%
a 1
8.3%
n 1
8.3%
t 1
8.3%
h 1
8.3%
r 1
8.3%
i 1
8.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 2
16.7%
c 2
16.7%
u 2
16.7%
a 1
8.3%
n 1
8.3%
t 1
8.3%
h 1
8.3%
r 1
8.3%
i 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m 2
16.7%
c 2
16.7%
u 2
16.7%
a 1
8.3%
n 1
8.3%
t 1
8.3%
h 1
8.3%
r 1
8.3%
i 1
8.3%
Distinct4845
Distinct (%)1.7%
Missing1404
Missing (%)0.5%
Memory size2.2 MiB

Length

Max length67
Median length44
Mean length8.539514405
Min length2

Characters and Unicode

Total characters2461293
Distinct characters80
Distinct categories9
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique714
Unique (%)0.2%

Sample

1st roworientalis cf
2nd rowviscivorus
3rd rowsplendida
4th rowelegans
5th rowanthopeplus
ValueCountFrequency (%)
alba 2079
 
0.7%
major 1955
 
0.7%
domesticus 1905
 
0.7%
cinerea 1831
 
0.6%
vulgaris 1740
 
0.6%
chloris 1590
 
0.6%
montanus 1543
 
0.5%
chinensis 1505
 
0.5%
cristatus 1485
 
0.5%
glandarius 1450
 
0.5%
Other values (4711) 271837
94.1%

Most occurring characters

ValueCountFrequency (%)
a 307107
12.5%
i 241479
9.8%
s 237211
9.6%
u 187770
 
7.6%
r 183768
 
7.5%
e 169869
 
6.9%
l 157994
 
6.4%
n 151086
 
6.1%
c 143459
 
5.8%
o 140138
 
5.7%
Other values (70) 541412
22.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2459202
99.9%
Uppercase Letter 767
 
< 0.1%
Space Separator 700
 
< 0.1%
Other Punctuation 388
 
< 0.1%
Decimal Number 148
 
< 0.1%
Close Punctuation 26
 
< 0.1%
Open Punctuation 26
 
< 0.1%
Math Symbol 26
 
< 0.1%
Dash Punctuation 10
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 307107
12.5%
i 241479
9.8%
s 237211
9.6%
u 187770
 
7.6%
r 183768
 
7.5%
e 169869
 
6.9%
l 157994
 
6.4%
n 151086
 
6.1%
c 143459
 
5.8%
o 140138
 
5.7%
Other values (19) 539321
21.9%
Uppercase Letter
ValueCountFrequency (%)
R 284
37.0%
C 162
21.1%
A 95
 
12.4%
M 76
 
9.9%
X 30
 
3.9%
S 26
 
3.4%
L 14
 
1.8%
T 13
 
1.7%
P 12
 
1.6%
Z 10
 
1.3%
Other values (13) 45
 
5.9%
Decimal Number
ValueCountFrequency (%)
1 37
25.0%
8 24
16.2%
3 21
14.2%
7 15
10.1%
4 13
 
8.8%
5 11
 
7.4%
2 9
 
6.1%
0 9
 
6.1%
6 5
 
3.4%
9 4
 
2.7%
Other Punctuation
ValueCountFrequency (%)
. 272
70.1%
? 45
 
11.6%
: 22
 
5.7%
' 15
 
3.9%
/ 10
 
2.6%
" 10
 
2.6%
& 6
 
1.5%
, 5
 
1.3%
! 3
 
0.8%
Math Symbol
ValueCountFrequency (%)
< 13
50.0%
> 11
42.3%
2
 
7.7%
Close Punctuation
ValueCountFrequency (%)
] 14
53.8%
) 12
46.2%
Open Punctuation
ValueCountFrequency (%)
[ 14
53.8%
( 12
46.2%
Space Separator
ValueCountFrequency (%)
700
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2459969
99.9%
Common 1324
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 307107
12.5%
i 241479
9.8%
s 237211
9.6%
u 187770
 
7.6%
r 183768
 
7.5%
e 169869
 
6.9%
l 157994
 
6.4%
n 151086
 
6.1%
c 143459
 
5.8%
o 140138
 
5.7%
Other values (42) 540088
22.0%
Common
ValueCountFrequency (%)
700
52.9%
. 272
 
20.5%
? 45
 
3.4%
1 37
 
2.8%
8 24
 
1.8%
: 22
 
1.7%
3 21
 
1.6%
7 15
 
1.1%
' 15
 
1.1%
] 14
 
1.1%
Other values (18) 159
 
12.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2461276
> 99.9%
None 15
 
< 0.1%
Math Operators 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 307107
12.5%
i 241479
9.8%
s 237211
9.6%
u 187770
 
7.6%
r 183768
 
7.5%
e 169869
 
6.9%
l 157994
 
6.4%
n 151086
 
6.1%
c 143459
 
5.8%
o 140138
 
5.7%
Other values (66) 541395
22.0%
None
ValueCountFrequency (%)
ü 13
86.7%
à 1
 
6.7%
ö 1
 
6.7%
Math Operators
ValueCountFrequency (%)
2
100.0%

infraspecificEpithet
Text

Missing 

Distinct6953
Distinct (%)3.5%
Missing89169
Missing (%)30.8%
Memory size2.2 MiB

Length

Max length87
Median length48
Mean length8.519253314
Min length1

Characters and Unicode

Total characters1707761
Distinct characters83
Distinct categories9
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1473
Unique (%)0.7%

Sample

1st rowviscivorus
2nd rowmelanopterus
3rd rowmonarchoides
4th rowrubescens
5th rowmeridionalis
ValueCountFrequency (%)
subsp 2295
 
1.1%
ssp 2260
 
1.1%
domesticus 2258
 
1.1%
vulgaris 1490
 
0.7%
cinerea 1182
 
0.6%
merula 1145
 
0.6%
rubecula 1127
 
0.6%
cf 1062
 
0.5%
javanica 1020
 
0.5%
nisus 1017
 
0.5%
Other values (6582) 187257
92.6%

Most occurring characters

ValueCountFrequency (%)
a 198590
11.6%
i 182188
10.7%
s 174371
10.2%
r 127411
 
7.5%
e 123135
 
7.2%
u 121483
 
7.1%
n 110399
 
6.5%
l 102692
 
6.0%
o 94165
 
5.5%
c 92151
 
5.4%
Other values (73) 381176
22.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1701862
99.7%
Other Punctuation 2748
 
0.2%
Space Separator 1661
 
0.1%
Uppercase Letter 830
 
< 0.1%
Math Symbol 256
 
< 0.1%
Decimal Number 194
 
< 0.1%
Open Punctuation 74
 
< 0.1%
Close Punctuation 74
 
< 0.1%
Dash Punctuation 62
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 198590
11.7%
i 182188
10.7%
s 174371
10.2%
r 127411
 
7.5%
e 123135
 
7.2%
u 121483
 
7.1%
n 110399
 
6.5%
l 102692
 
6.0%
o 94165
 
5.5%
c 92151
 
5.4%
Other values (22) 375277
22.1%
Uppercase Letter
ValueCountFrequency (%)
R 231
27.8%
H 123
14.8%
M 75
 
9.0%
C 56
 
6.7%
B 55
 
6.6%
I 48
 
5.8%
Y 45
 
5.4%
D 40
 
4.8%
A 36
 
4.3%
S 24
 
2.9%
Other values (13) 97
11.7%
Decimal Number
ValueCountFrequency (%)
8 44
22.7%
1 41
21.1%
5 18
9.3%
6 17
 
8.8%
3 15
 
7.7%
4 14
 
7.2%
9 13
 
6.7%
0 12
 
6.2%
7 10
 
5.2%
2 10
 
5.2%
Other Punctuation
ValueCountFrequency (%)
. 2582
94.0%
' 59
 
2.1%
/ 56
 
2.0%
? 24
 
0.9%
: 21
 
0.8%
, 4
 
0.1%
! 1
 
< 0.1%
& 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
< 127
49.6%
> 120
46.9%
= 9
 
3.5%
Close Punctuation
ValueCountFrequency (%)
} 29
39.2%
] 25
33.8%
) 20
27.0%
Open Punctuation
ValueCountFrequency (%)
[ 54
73.0%
( 20
 
27.0%
Space Separator
ValueCountFrequency (%)
1661
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 62
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1702692
99.7%
Common 5069
 
0.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 198590
11.7%
i 182188
10.7%
s 174371
10.2%
r 127411
 
7.5%
e 123135
 
7.2%
u 121483
 
7.1%
n 110399
 
6.5%
l 102692
 
6.0%
o 94165
 
5.5%
c 92151
 
5.4%
Other values (45) 376107
22.1%
Common
ValueCountFrequency (%)
. 2582
50.9%
1661
32.8%
< 127
 
2.5%
> 120
 
2.4%
- 62
 
1.2%
' 59
 
1.2%
/ 56
 
1.1%
[ 54
 
1.1%
8 44
 
0.9%
1 41
 
0.8%
Other values (18) 263
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1707571
> 99.9%
None 190
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 198590
11.6%
i 182188
10.7%
s 174371
10.2%
r 127411
 
7.5%
e 123135
 
7.2%
u 121483
 
7.1%
n 110399
 
6.5%
l 102692
 
6.0%
o 94165
 
5.5%
c 92151
 
5.4%
Other values (67) 380986
22.3%
None
ValueCountFrequency (%)
ü 61
32.1%
ë 51
26.8%
ï 43
22.6%
ö 30
15.8%
á 3
 
1.6%
ç 2
 
1.1%
Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.2 MiB

Length

Max length10
Median length10
Mean length9.067072244
Min length5

Characters and Unicode

Total characters2626078
Distinct characters20
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowspecies
2nd rowsubspecies
3rd rowspecies
4th rowsubspecies
5th rowsubspecies
ValueCountFrequency (%)
subspecies 200450
69.2%
species 87771
30.3%
genus 850
 
0.3%
class 400
 
0.1%
family 144
 
< 0.1%
order 12
 
< 0.1%
swinhoe 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
s 778542
29.6%
e 577305
22.0%
c 288621
 
11.0%
i 288366
 
11.0%
p 288221
 
11.0%
u 201300
 
7.7%
b 200450
 
7.6%
n 851
 
< 0.1%
g 850
 
< 0.1%
a 544
 
< 0.1%
Other values (10) 1028
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2626077
> 99.9%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 778542
29.6%
e 577305
22.0%
c 288621
 
11.0%
i 288366
 
11.0%
p 288221
 
11.0%
u 201300
 
7.7%
b 200450
 
7.6%
n 851
 
< 0.1%
g 850
 
< 0.1%
a 544
 
< 0.1%
Other values (9) 1027
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
S 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2626078
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 778542
29.6%
e 577305
22.0%
c 288621
 
11.0%
i 288366
 
11.0%
p 288221
 
11.0%
u 201300
 
7.7%
b 200450
 
7.6%
n 851
 
< 0.1%
g 850
 
< 0.1%
a 544
 
< 0.1%
Other values (10) 1028
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2626078
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 778542
29.6%
e 577305
22.0%
c 288621
 
11.0%
i 288366
 
11.0%
p 288221
 
11.0%
u 201300
 
7.7%
b 200450
 
7.6%
n 851
 
< 0.1%
g 850
 
< 0.1%
a 544
 
< 0.1%
Other values (10) 1028
 
< 0.1%
Distinct6059
Distinct (%)2.2%
Missing17143
Missing (%)5.9%
Memory size2.2 MiB

Length

Max length62
Median length39
Mean length13.82024332
Min length1

Characters and Unicode

Total characters3765809
Distinct characters81
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1183 ?
Unique (%)0.4%

Sample

1st rowHeuglin, 1871
2nd rowLinnaeus, 1758
3rd rowGould, 1841
4th rowNorth, 1906
5th rowTemminck, 1823
ValueCountFrequency (%)
linnaeus 87214
 
16.4%
1758 62801
 
11.8%
temminck 13007
 
2.4%
vieillot 10905
 
2.0%
10530
 
2.0%
gmelin 9441
 
1.8%
horsfield 8367
 
1.6%
1766 7967
 
1.5%
1821 5912
 
1.1%
1789 5905
 
1.1%
Other values (1402) 310792
58.3%

Most occurring characters

ValueCountFrequency (%)
n 267737
 
7.1%
260390
 
6.9%
1 254855
 
6.8%
e 234945
 
6.2%
, 228637
 
6.1%
a 196611
 
5.2%
8 193240
 
5.1%
i 187873
 
5.0%
s 150318
 
4.0%
7 127152
 
3.4%
Other values (71) 1664051
44.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1799096
47.8%
Decimal Number 869779
23.1%
Uppercase Letter 319594
 
8.5%
Other Punctuation 278101
 
7.4%
Space Separator 260390
 
6.9%
Open Punctuation 119038
 
3.2%
Close Punctuation 118980
 
3.2%
Dash Punctuation 830
 
< 0.1%
Modifier Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 267737
14.9%
e 234945
13.1%
a 196611
10.9%
i 187873
10.4%
s 150318
8.4%
l 113272
 
6.3%
u 110461
 
6.1%
r 92051
 
5.1%
o 82452
 
4.6%
t 65094
 
3.6%
Other values (24) 298282
16.6%
Uppercase Letter
ValueCountFrequency (%)
L 105931
33.1%
S 29884
 
9.4%
G 25470
 
8.0%
B 25260
 
7.9%
H 21030
 
6.6%
T 15802
 
4.9%
V 13298
 
4.2%
P 12360
 
3.9%
R 12049
 
3.8%
M 11804
 
3.7%
Other values (16) 46706
14.6%
Decimal Number
ValueCountFrequency (%)
1 254855
29.3%
8 193240
22.2%
7 127152
14.6%
5 82954
 
9.5%
9 43956
 
5.1%
6 42895
 
4.9%
2 39363
 
4.5%
3 33870
 
3.9%
4 26033
 
3.0%
0 25461
 
2.9%
Other Punctuation
ValueCountFrequency (%)
, 228637
82.2%
. 38965
 
14.0%
& 9804
 
3.5%
' 468
 
0.2%
? 211
 
0.1%
\ 16
 
< 0.1%
Space Separator
ValueCountFrequency (%)
260390
100.0%
Open Punctuation
ValueCountFrequency (%)
( 119038
100.0%
Close Punctuation
ValueCountFrequency (%)
) 118980
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 830
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2118690
56.3%
Common 1647119
43.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 267737
12.6%
e 234945
11.1%
a 196611
 
9.3%
i 187873
 
8.9%
s 150318
 
7.1%
l 113272
 
5.3%
u 110461
 
5.2%
L 105931
 
5.0%
r 92051
 
4.3%
o 82452
 
3.9%
Other values (50) 577039
27.2%
Common
ValueCountFrequency (%)
260390
15.8%
1 254855
15.5%
, 228637
13.9%
8 193240
11.7%
7 127152
7.7%
( 119038
7.2%
) 118980
7.2%
5 82954
 
5.0%
9 43956
 
2.7%
6 42895
 
2.6%
Other values (11) 175022
10.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3756877
99.8%
None 8932
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 267737
 
7.1%
260390
 
6.9%
1 254855
 
6.8%
e 234945
 
6.3%
, 228637
 
6.1%
a 196611
 
5.2%
8 193240
 
5.1%
i 187873
 
5.0%
s 150318
 
4.0%
7 127152
 
3.4%
Other values (63) 1655119
44.1%
None
ValueCountFrequency (%)
ü 7299
81.7%
é 471
 
5.3%
ø 465
 
5.2%
ä 379
 
4.2%
á 242
 
2.7%
è 46
 
0.5%
ö 27
 
0.3%
û 3
 
< 0.1%

nomenclaturalCode
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing1
Missing (%)< 0.1%
Memory size2.2 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters1158508
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowICZN
2nd rowICZN
3rd rowICZN
4th rowICZN
5th rowICZN
ValueCountFrequency (%)
iczn 289627
100.0%

Most occurring characters

ValueCountFrequency (%)
I 289627
25.0%
C 289627
25.0%
Z 289627
25.0%
N 289627
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1158508
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 289627
25.0%
C 289627
25.0%
Z 289627
25.0%
N 289627
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1158508
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 289627
25.0%
C 289627
25.0%
Z 289627
25.0%
N 289627
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1158508
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 289627
25.0%
C 289627
25.0%
Z 289627
25.0%
N 289627
25.0%